Summarise proportions of treatment response by two subgroups
I generated the following random data which has the same structure as my real data:
data <- structure(list(gender = structure(c(1L, 2L, 1L, 2L, 2L, 1L, 1L,
1L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 2L,
2L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 2L,
2L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 2L,
1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 2L,
2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 1L), .Label = c("Female",
"Male"), class = "factor"), treatment_response = structure(c(2L,
2L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 2L,
1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L,
2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 2L,
1L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L,
1L, 2L, 1L), .Label = c("Non-responder", "Responder"), class = "factor"),
country = structure(c(1L, 3L, 9L, 7L, 2L, 2L, 6L, 6L, 4L,
9L, 5L, 7L, 3L, 2L, 8L, 8L, 10L, 3L, 1L, 1L, 9L, 2L, 5L,
5L, 2L, 7L, 5L, 1L, 6L, 10L, 7L, 10L, 2L, 10L, 1L, 4L, 6L,
8L, 9L, 4L, 3L, 6L, 1L, 8L, 7L, 3L, 2L, 10L, 7L, 6L, 1L,
9L, 8L, 4L, 8L, 7L, 3L, 5L, 3L, 4L, 7L, 4L, 8L, 4L, 5L, 6L,
8L, 1L, 7L, 5L, 8L, 1L, 7L, 10L, 8L, 1L, 9L, 8L, 6L, 6L,
10L, 7L, 3L, 6L, 5L, 10L, 2L, 1L, 9L, 5L, 5L, 10L, 2L, 6L,
10L, 4L, 8L, 7L, 9L, 8L), .Label = c("Switzerland", "Czech Republic",
"Denmark", "Iceland", "Netherlands", "Norway", "Portugal",
"Romania", "Sweden", "Finland"), class = "factor")), class = "data.frame", row.names = c(NA,
-100L))
This is what it looks like:
'data.frame': 100 obs. of 3 variables:
$ gender : Factor w/ 2 levels "Female","Male": 1 2 1 2 2 1 1 1 2 1 ...
$ treatment_response: Factor w/ 2 levels "Non-responder",..: 2 2 1 2 1 1 2 2 2 1 ...
$ country : Factor w/ 10 levels "Switzerland",..: 1 3 9 7 2 2 6 6 4 9 ...
I wish to create a data.frame in which treatment_response is summarised by gender and country as percentages or proportions. For example, 60% (or 0.6) of males in Switzerland are responders and 40% are non-responders, and similarly for females and for the other countries.
I am familiar with dplyr and have found a way to do this for the mean, but not for proportions or percentages.
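One possible approach, sketched here under assumptions rather than offered as a definitive answer: count the rows for each combination of country, gender and treatment_response, then divide each count by the total for its country/gender subgroup. The data frame `toy` and the column names `n`, `proportion` and `percent` are made up for illustration; `toy` mirrors the three columns of `data` and is built so that males in Switzerland match the 60%/40% example above.

```r
library(dplyr)

# Hypothetical toy data with the same three columns as `data`
# (values are invented for illustration only).
toy <- data.frame(
  gender = factor(c("Male", "Male", "Male", "Male", "Male",
                    "Female", "Female")),
  treatment_response = factor(c("Responder", "Responder", "Responder",
                                "Non-responder", "Non-responder",
                                "Responder", "Non-responder")),
  country = factor(rep("Switzerland", 7))
)

response_props <- toy %>%
  # one row per observed country/gender/response combination, with its count
  count(country, gender, treatment_response, name = "n") %>%
  # proportions are taken within each country/gender subgroup
  group_by(country, gender) %>%
  mutate(proportion = n / sum(n),
         percent = 100 * proportion) %>%
  ungroup()

print(response_props)
```

Replacing `toy` with `data` should give the same kind of summary for the real structure. Note that `count()` by default omits combinations that never occur, so a subgroup with no responders gets no row for "Responder" rather than a row with proportion 0.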
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
