How can I create a proportional table for one categorical variable and one numerical variable (with the numerical values shown as proportions)?

library(tidyverse)  # attaches dplyr, tidyr, and ggplot2

# Read the daily AQI records and keep only the columns of interest
aqi <- read.csv("aqi12_21.csv")
aqi <- select(aqi, State.Name, county.Name, Date, AQI, Category, Defining.Parameter)
aqi <- rename(aqi, State = State.Name, County = county.Name)

# Split the date into its components
aqi <- separate(aqi, Date, c("Year", "Month", "Day"))

# Keep only days with AQI above 100, then average per state
AQI_HIGH <- filter(aqi, AQI > 100)
average_aqi_state <- AQI_HIGH %>%
  group_by(State) %>%
  summarise(average_aqi = mean(AQI))

So I now have my average data, with one row per state and its average AQI (the original post showed a screenshot of this table, which is not reproduced here).

I don't know how to create a proportional graph (with the average AQI shown as a percentage) while State remains a categorical variable.

Solution 1:[1]

Suppose this simplified form of data represents your actual data:

dat <- structure(list(State = c("Alabama", "Alaska", "Arizona", "Others"
), average_aqi = c(300, 550, 150, 1000)), class = "data.frame", row.names = c(NA, 
-4L))

If I understand your purpose correctly, you want each state's average_aqi as a proportion of the total, which you can compute with dplyr like this:

dat |> mutate(avaqi_perc = average_aqi/sum(average_aqi))

#    State average_aqi avaqi_perc
#1 Alabama         300      0.150
#2  Alaska         550      0.275
#3 Arizona         150      0.075
#4  Others        1000      0.500
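
Since the question also asks for a graph, here is a minimal sketch of one way to plot these proportions while keeping State categorical. It reuses the toy data from this answer, not the real AQI file. The names dat_perc and perc_label and the axis title are made up for illustration, and the reading that "proportion" means each state's share of the summed averages is the same assumption as above.

```r
library(dplyr)
library(ggplot2)

# Toy data from the answer above (an assumption, not the real AQI file)
dat <- data.frame(
  State = c("Alabama", "Alaska", "Arizona", "Others"),
  average_aqi = c(300, 550, 150, 1000)
)

# Each state's share of the summed averages, plus a text label in percent
dat_perc <- dat |>
  mutate(
    avaqi_perc = average_aqi / sum(average_aqi),
    perc_label = sprintf("%.1f%%", 100 * avaqi_perc)
  )

# State stays categorical on the x axis; the y axis is labelled in percent
p <- ggplot(dat_perc, aes(x = State, y = avaqi_perc)) +
  geom_col() +
  geom_text(aes(label = perc_label), vjust = -0.5) +
  scale_y_continuous(labels = scales::percent) +
  labs(x = "State", y = "Share of summed average AQI")

print(p)
```

To order the bars by size rather than alphabetically, x = reorder(State, avaqi_perc) can be used inside aes().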

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Abdur Rohman