How to make a barplot of groups in a data frame?

For this data:

class <- c(1, 2, 3, 2, 1, 4, 5, 4, 2, 4)
prog <- c("Bac2", "Bac", "Master", "Bac", "Bac", "DEA", "Doctorat", "DEA", "Bac", "DEA")
mydata <- data.frame(height = class, prog)

I want to make a stacked bar plot in which each bar shows the share of each height value within a prog. For example:

   every height for Bac2 is 1, so its bar is 100% of 1
   the heights for Bac are 2, 2, 1, 2, so its bar is 75% of 2 and 25% of 1
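
As a quick check of the arithmetic above, the shares for a single group can be computed in base R. This is only a sketch: the names bac_heights and bac_share are illustrative and do not appear in the original post.

```r
# Data from the question
mydata <- data.frame(
  height = c(1, 2, 3, 2, 1, 4, 5, 4, 2, 4),
  prog   = c("Bac2", "Bac", "Master", "Bac", "Bac",
             "DEA", "Doctorat", "DEA", "Bac", "DEA")
)

# Heights recorded for the "Bac" group: 2, 2, 1, 2
bac_heights <- mydata$height[mydata$prog == "Bac"]

# Share of each height value within the group
bac_share <- prop.table(table(bac_heights))
print(bac_share)  # height 1: 0.25, height 2: 0.75
```

The same calculation, repeated for every prog, gives the segment heights of the stacked bars.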


mydata <- structure(list(
  height = c(1, 2, 3, 2, 1, 4, 5, 4, 2, 4),
  prog = c("Bac2", "Bac", "Master", "Bac", "Bac", "DEA", "Doctorat", "DEA", "Bac", "DEA")
), class = "data.frame", row.names = c(NA, -10L))


Solution 1:[1]

class <- c(1, 2, 3, 2, 1, 4, 5, 4, 2, 4) 
prog <- c("Bac2", "Bac", "Master", "Bac", "Bac", "DEA", "Doctorat", "DEA", "Bac", "DEA")
mydata <- data.frame(height = class, prog)
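library(dplyr)
library(ggplot2)
library(forcats)

mydata %>%
  # count rows per (prog, height), then convert to within-prog proportions
  group_by(prog, height) %>%
  tally() %>%
  mutate(prop = n / sum(n)) %>%
  ggplot(aes(x = prog, y = prop, fill = fct_rev(as.factor(height)))) +
  geom_col() +
  # limits (not labels) sets the bar order; labels alone would only rename
  # the alphabetically ordered bars and so mislabel them
  scale_x_discrete(limits = c("Bac2", "Bac", "Master", "DEA", "Doctorat")) +
  scale_y_continuous(labels = scales::percent) +
  theme(legend.position = "none")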

Created on 2022-05-10 by the reprex package (v2.0.1)

Solution 2:[2]

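A succinct base R approach: compute the per-group proportions with table and proportions, pad each result to a common length so they can be bound into a matrix, order the groups by their maximum proportion, and finally call barplot. The `\(x)` shorthand and the `|>` pipe require R 4.1 or later.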

p <- with(mydata, tapply(height, prog, \(x) proportions(table(x))))
lapply(p[order(-sapply(p, max))], `length<-`, max(lengths(p))) |>
  do.call(what=rbind) |> t() |> barplot(col=3:6)
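
For comparison, here is a sketch of a different base R route (not the code above): cross-tabulate height by prog and normalise each column with prop.table, which avoids the padding step. The names counts and shares are illustrative, and the column order is assumed to follow first appearance in the data.

```r
# Data from the question
mydata <- data.frame(
  height = c(1, 2, 3, 2, 1, 4, 5, 4, 2, 4),
  prog   = c("Bac2", "Bac", "Master", "Bac", "Bac",
             "DEA", "Doctorat", "DEA", "Bac", "DEA")
)

# Rows are height values, columns are prog values
counts <- table(mydata$height, mydata$prog)

# margin = 2 divides each column by its total, so every bar sums to 1
shares <- prop.table(counts, margin = 2)

# Put the columns in order of first appearance: Bac2, Bac, Master, DEA, Doctorat
shares <- shares[, unique(mydata$prog)]

# One stacked bar per prog; one colour per height value (five in this data)
barplot(shares, col = 2:6, legend.text = rownames(shares),
        xlab = "prog", ylab = "Proportion")
```

Because every row of shares corresponds to one height value, a given colour always means the same height, whereas the padded-list version assigns colours by position within each group.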


Solution 3:[3]

Here is a way.
Pre-compute the levels of prog so that "Bac2" comes before "Bac", as in the posted drawing, and count the unique height values so that every bar segment can be filled white.
Then plot the bars with position = "fill", which scales each bar to a total of 100%.

suppressPackageStartupMessages({
  library(dplyr)
  library(ggplot2)
})

mydata <- structure(list(
  height = c(1, 2, 3, 2, 1, 4, 5, 4, 2, 4),
  prog = c("Bac2", "Bac", "Master", "Bac", "Bac", "DEA", "Doctorat", "DEA", "Bac", "DEA")
), class = "data.frame", row.names = c(NA, -10L))

levs <- unique(mydata$prog)
nheight <- n_distinct(mydata$height)


mydata %>%
  mutate(prog = factor(prog, levels = levs)) %>%
  ggplot(aes(prog, fill = factor(height))) +
  geom_bar(position = "fill", colour = "black", show.legend = FALSE) +
  geom_text(aes(label = height), 
            stat = "count", 
            position = position_fill(vjust = 0.5)) +
  scale_fill_manual(values = rep("white", nheight)) +
  scale_y_continuous(labels = scales::percent)

Created on 2022-05-10 by the reprex package (v2.0.1)


Edit

The y-axis scale was changed to a percent scale.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 YH Jang
Solution 2 jay.sf
Solution 3