How do I get wide data tidied into the proper format for a stacked bar facet_grid plot?

I'm trying to get data into a 3x3 facet_grid plot, but am struggling to find the right tidy combination to make it work.

I can manage to get a single category to facet like so:

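# load the packages used below
library(dplyr)
library(tidyr)
library(ggplot2)
library(hrbrthemes) # theme_ipsum_rc()
library(ggeasy)     # easy_center_title()
# ingest some data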
df <- structure(list(Q52_bin = structure(c(3L, 2L, 2L, 2L, 2L, 2L), .Label = c("low", 
"medium", "high"), class = "factor"), Q53_bin = structure(c(2L, 
3L, 2L, 2L, 2L, 2L), .Label = c("low", "medium", "high"), class = "factor"), 
    Q57_bin = structure(c(2L, 2L, 2L, 2L, 2L, 2L), .Label = c("low", 
    "medium", "high"), class = "factor"), Q4 = c("A little", 
    "Some", "Some", "A great deal", "A lot", "Some")), row.names = c(NA, 
-6L), class = c("tbl_df", "tbl", "data.frame"))
# Now let's try and develop a faceted plot using the low/med/high bins we've created above under political_lr, spirituality etc.

# make column names coherent and simplified
names(df) <- c("Q52_bin", "Q53_bin", "Q57_bin", "response")
# filter out NA values
df <- filter(df, !is.na(response))
# generate new dataframe with sums per category and sort in descending order
df <- df %>%
  dplyr::count(response, Q52_bin, sort = TRUE) %>% 
  dplyr::mutate(response = forcats::fct_rev(forcats::fct_inorder(response))) 
# make plot
ggplot(df, aes(x = n, y = response)) +
  geom_col(colour = "white") + facet_grid(rows = vars(Q52_bin)) +
  ## reduce spacing between labels and bars
  scale_x_continuous(expand = c(.01, .01)) +
  scale_fill_identity(guide = "none") +
  ## get rid of all elements except y axis labels + adjust plot margin
  theme_ipsum_rc() +
  theme(plot.margin = margin(rep(15, 4))) +
  easy_center_title()

Using count() this way drops every bin column except Q52_bin. To keep all three, I believe I'll need to use pivot_longer(), in something like this:

# Now let's try and add in rows to represent other kinds of faceting in a 3x3 visualisation
df <- select(climate_experience_data_named, Q52_bin, Q53_bin, Q57_bin, Q4)
# make column names coherent and simplified
names(df) <- c("Q52_bin", "Q53_bin", "Q57_bin", "response")
# filter out NA values
df <- filter(df, !is.na(response))
# generate new dataframe with sums per category and sort in descending order
# commenting out percentages and labelling in plot as this will need to be handled differently in facets
# for additional faceted columns to work, we need to convert this data to long format so that bin data is integrated into counts
df <- df %>%
  pivot_longer(!response, names_to = "bin_name", values_to = "b")
df <- df %>%
  dplyr::count(response, bin_name, sort = TRUE)

# Broken plot!
ggplot(df, aes(x = bin_name, y = n)) +
  geom_col(colour = "white", stat='identity') + facet_grid(rows = vars(?), cols = vars(bin_name))

The goal is to have facet rows for "low", "medium", and "high" (as in the plot above), facet columns for "Q52_bin", "Q53_bin", and "Q57_bin", and stacked bars inside each panel for the Likert-style responses in this data. I believe this breaks at the point where I use count(), but I can't work out how to reconfigure it, and the plot, obviously, isn't getting anywhere either. I suspect this is just an easy tweak, but it seems to be beyond me!



Solution 1:[1]

I'm not sure I've exactly understood your final desired plot, but I think from your original df you can do this:

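library(dplyr)
library(tidyr)
library(ggplot2)

names(df) <- c("Q52_bin", "Q53_bin", "Q57_bin", "response")
df %>% 
  # one row per respondent and question; b holds the low/medium/high level
  pivot_longer(!response, names_to = "bin_name", values_to = "b") %>% 
  # keep b in the count so it is still available for faceting
  count(response, bin_name, b) %>% 
  ggplot(aes(x=n,y=response)) +
  geom_col(color='white') + 
  # facet rows = question (bin_name), facet columns = level (b)
  facet_grid(vars(bin_name), vars(b))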

Output:

[image: likert bar plot]
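
If the layout you're after is the one described in the question (rows for low/medium/high, columns for each question, one stacked bar per panel), a minimal sketch might look like the following. It rebuilds the question's sample data; the names bin_levels, bin_level, counts, and p are just illustrative, and mapping x to a constant so that each panel holds a single stacked bar is my assumption about what "stacked" should mean here.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Sample data rebuilt from the question's dput() output
bin_levels <- c("low", "medium", "high")
df <- tibble(
  Q52_bin  = factor(c("high", rep("medium", 5)), levels = bin_levels),
  Q53_bin  = factor(c("medium", "high", rep("medium", 4)), levels = bin_levels),
  Q57_bin  = factor(rep("medium", 6), levels = bin_levels),
  response = c("A little", "Some", "Some", "A great deal", "A lot", "Some")
)

# Long format: one row per respondent and question, then count every
# combination of question, bin level, and response
counts <- df %>%
  pivot_longer(!response, names_to = "bin_name", values_to = "bin_level") %>%
  count(bin_name, bin_level, response)

# Assumption: one stacked bar per panel, so x is a constant and the
# responses are stacked through the fill aesthetic
p <- ggplot(counts, aes(x = "", y = n, fill = response)) +
  geom_col(colour = "white") +
  # drop = FALSE keeps the "low" row even though this small sample has none
  facet_grid(rows = vars(bin_level), cols = vars(bin_name), drop = FALSE) +
  labs(x = NULL, y = "count")

print(p)
```

With the full survey data you would swap the tibble() call for your own data frame. Passing position = "fill" to geom_col() should turn the stacked counts into proportions if that reads better for Likert responses.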

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 langtang