Get percentage of values for column grouped by other column - R
Suppose I have a table like this:
| Division | Color |
|---|---|
| A | Red |
| A | Blue |
| A | Blue |
| A | Yellow |
| B | Blue |
| B | Yellow |
| C | Green |
I want to find the percentage of each color within each division, so the output should look like this:
| Division | Red | Blue | Yellow | Green |
|---|---|---|---|---|
| A | 25.0 | 50.0 | 25.0 | 0 |
| B | 0 | 50.0 | 50.0 | 0 |
| C | 0 | 0 | 0 | 100.0 |
How can I do this in R?
Solution 1:[1]
You can build a contingency table and divide each row by its row total:
tab <- table(df$Division, df$Color)
100 * tab / rowSums(tab)
#> Blue Green Red Yellow
#> A 50 0 25 25
#> B 50 0 0 50
#> C 0 100 0 0
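The table above lists colors alphabetically, while the question asks for a `Division` column followed by Red, Blue, Yellow, Green. A minimal sketch of one way to get that layout, assuming the same example data; `color_order`, `wide` and `result` are illustrative names, not part of the original answer:

```r
# Example data, same values as the reproducible data frame in this answer
df <- data.frame(
  Division = c("A", "A", "A", "A", "B", "B", "C"),
  Color    = c("Red", "Blue", "Blue", "Yellow", "Blue", "Yellow", "Green")
)

tab <- table(df$Division, df$Color)
pct <- 100 * tab / rowSums(tab)  # each cell divided by its own row total

# Assumed column order, taken from the desired output in the question
color_order <- c("Red", "Blue", "Yellow", "Green")

# as.data.frame.matrix() keeps the wide layout (one column per color)
wide <- as.data.frame.matrix(pct[, color_order])

# Move the division labels from row names into a regular column
result <- cbind(Division = rownames(wide), wide)
rownames(result) <- NULL
result
```

The division `tab / rowSums(tab)` works because R recycles the vector of row totals down each column, so every cell in row `i` is divided by the total of row `i`.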
Data in reproducible format:
df <- structure(list(
  Division = c("A", "A", "A", "A", "B", "B", "C"),
  Color = c("Red", "Blue", "Blue", "Yellow", "Blue", "Yellow", "Green")
), class = "data.frame", row.names = c(NA, -7L))
Solution 2:[2]
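Another approach uses the janitor package, whose tabyl() and adorn_*() functions are designed for this kind of cross-tabulation:
library(janitor)
df %>%
  tabyl(Division, Color) %>%      # counts of each Color per Division
  adorn_percentages("row") %>%    # convert counts to row-wise proportions
  adorn_pct_formatting()          # format as percentages with one decimal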
Division Blue Green Red Yellow
A 50.0% 0.0% 25.0% 25.0%
B 50.0% 0.0% 0.0% 50.0%
C 0.0% 100.0% 0.0% 0.0%
Solution 3:[3]
Using proportions() from base R (R 4.0.0 or later), with margin = 1 so that each row sums to 100:
proportions(table(df), margin = 1) * 100
# Color
# Division Blue Green Red Yellow
# A 50 0 25 25
# B 50 0 0 50
# C 0 100 0 0
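The `margin` argument decides which totals the counts are divided by. The sketch below, assuming the same example data, contrasts `margin = 1` (row totals, as the question asks) with `margin = 2` (column totals) and shows `prop.table()`, the older base R name for the same computation. The variable names `row_pct`, `col_pct` and `old_style` are illustrative only.

```r
# Example data, same values as in the question
df <- data.frame(
  Division = c("A", "A", "A", "A", "B", "B", "C"),
  Color    = c("Red", "Blue", "Blue", "Yellow", "Blue", "Yellow", "Green")
)

tab <- table(df)  # Division in rows, Color in columns

# margin = 1: percentages within each Division (each row sums to 100)
row_pct <- proportions(tab, margin = 1) * 100

# margin = 2: percentages within each Color (each column sums to 100)
col_pct <- proportions(tab, margin = 2) * 100

# prop.table() gives the same result and also works before R 4.0.0
old_style <- prop.table(tab, margin = 1) * 100

rowSums(row_pct)
colSums(col_pct)
```

If the row and column roles were swapped in `table()`, `margin = 2` would be the one that groups by division, so it is worth checking which sums come out as 100.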
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Allan Cameron |
| Solution 2 | Anoushiravan R |
| Solution 3 | jay.sf |
