R summary by group and within groups
Here is my dataset, arranged by Subject and Test:
ID Subjects Test Score Results
1 English 1 78 Pass
2 English 1 98 Pass
2 English 2 81 Pass
3 English 2 81 Pass
2 English 3 15 Fail
3 English 3 74 Pass
4 Physics 1 34 Fail
2 Physics 1 79 Pass
4 Physics 2 74 Fail
3 Physics 2 81 Pass
3 Physics 2 81 Pass
4 Physics 3 48 Fail
2 Physics 3 15 Fail
3 Physics 3 74 Pass
I am interested in creating summaries like this:
           Test1                Test2                Test3
Subject    FailAverage  %Fail   FailAverage  %Fail   FailAverage  %Fail
English    0            0       0            0       15           50
Physics    34           50      74           33      31.5         66
- Summaries grouped by Test attempt (1, 2, 3)
- Summaries for each subject
- For each test attempt, the percentage who failed and the average score of those who failed during that attempt. For example, for Physics on Test attempt 3, two of the three students failed, so %Fail is (2/3)*100 and the average score among those who failed is (48+15)/2 = 31.5
Any help is much appreciated. Thanks.
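The arithmetic for that one cell can be checked with a few lines of base R. This is only an illustrative sketch: the vector names (`scores`, `results`, `failed`, `pct_fail`, `fail_average`) are made up for the example, and the values are copied from the three Physics, Test 3 rows of the data above.

```r
# Physics, Test attempt 3: scores and results copied from the dataset
scores  <- c(48, 15, 74)
results <- c("Fail", "Fail", "Pass")

# logical vector marking the students who failed
failed <- results == "Fail"

# share of failures as a percentage: (2/3) * 100
pct_fail <- mean(failed) * 100

# average score among the failures only: (48 + 15) / 2 = 31.5
fail_average <- mean(scores[failed])

print(c(pct_fail = pct_fail, fail_average = fail_average))
```

The same two expressions, applied per Subject and Test group, would give every cell of the desired summary.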
Solution 1:[1]
I made an attempt using tidyverse principles. To get that exact format you will probably need a table package (e.g. gt), but the code below gets you close.
I summarized the data into a new data frame, then used pivot_wider() to turn the Test rows into columns, and lastly did some minor tidying of the column order and names.
# load the tidyverse (provides tribble, %>%, the dplyr verbs and pivot_wider)
library(tidyverse)

# recreate the table
df <- tribble(
  ~ID, ~Subjects, ~Test, ~Score, ~Results,
  1, "English", 1, 78, "Pass",
  2, "English", 1, 98, "Pass",
  2, "English", 2, 81, "Pass",
  3, "English", 2, 81, "Pass",
  2, "English", 3, 15, "Fail",
  3, "English", 3, 74, "Pass",
  4, "Physics", 1, 34, "Fail",
  2, "Physics", 1, 79, "Pass",
  4, "Physics", 2, 74, "Fail",
  3, "Physics", 2, 81, "Pass",
  3, "Physics", 2, 81, "Pass",
  4, "Physics", 3, 48, "Fail",
  2, "Physics", 3, 15, "Fail",
  3, "Physics", 3, 74, "Pass")
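# summarize each Subject/Test group
df_fail <- df %>%
  group_by(Subjects, Test) %>%
  summarize(FailAverage = mean(Score[Results == "Fail"]),     # NaN when nobody failed
            Failper = mean(Results == "Fail", na.rm = TRUE),  # proportion between 0 and 1
            .groups = "drop")                                 # return an ungrouped result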
# pivot the values wider, arrange the columns in test order, then rename them
df_fail %>%
  pivot_wider(names_from = c(Test),
              values_from = c(FailAverage, Failper)) %>%
  relocate(Subjects, contains("1"), contains("2"), contains("3")) %>%
  rename_with(.cols = c(-Subjects), .fn = ~ gsub("_", "_test", .x))
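Note that the code above reports Failper as a proportion (0 to 1) and gives NaN for FailAverage when nobody in a group failed, whereas the question asks for a percentage and shows 0 in those cells. A possible variant, sketched under a few assumptions, is shown here: the names `FailPct` and `summary_wide` and the `Test{n}_` column prefix are illustrative choices, the data are rebuilt with `data.frame()` so the block runs on its own (the ID column is omitted because it is not used), and the `names_vary` argument assumes tidyr 1.2.0 or later.

```r
library(dplyr)
library(tidyr)

# rebuild the question's data (ID omitted; it is not needed for the summary)
df <- data.frame(
  Subjects = rep(c("English", "Physics"), times = c(6, 8)),
  Test     = c(1, 1, 2, 2, 3, 3,  1, 1, 2, 2, 2, 3, 3, 3),
  Score    = c(78, 98, 81, 81, 15, 74,  34, 79, 74, 81, 81, 48, 15, 74),
  Results  = c("Pass", "Pass", "Pass", "Pass", "Fail", "Pass",
               "Fail", "Pass", "Fail", "Pass", "Pass", "Fail", "Fail", "Pass")
)

summary_wide <- df %>%
  group_by(Subjects, Test) %>%
  summarise(
    # average score of the failures, or 0 when the group has no failures
    FailAverage = if (any(Results == "Fail")) mean(Score[Results == "Fail"]) else 0,
    # share of failures expressed as a percentage
    FailPct = 100 * mean(Results == "Fail"),
    .groups = "drop"
  ) %>%
  pivot_wider(
    names_from  = Test,
    values_from = c(FailAverage, FailPct),
    names_glue  = "Test{Test}_{.value}",  # e.g. Test1_FailAverage
    names_vary  = "slowest"               # keep both columns of a test side by side
  )

print(summary_wide)
```

Using `names_glue` and `names_vary` here replaces the separate `relocate()` and `rename_with()` steps; whether that is preferable is a matter of taste, and the percentages are left unrounded.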
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
