Group by and summarise
I want to group according to 3 variables and create new variables with the summarise function.
My code:
Option 1
library(tidyverse)
library(dplyr)
example2<-example%>%
group_by(age_cohort,sex,city)%>%
summarise(rich=sum(rich),
middleclass=sum(middleclass),
poor=sum(poor),
population=count(id))
I don't understand the error:
Error in `summarise()`:
! Problem while computing `population = count(id)`.
i The error occurred in group 1: age_cohort = 1, sex = 0, city = 1.
Caused by error in `UseMethod()`:
! no applicable method for 'count' applied to an object of class "c('double', 'numeric')"
Run `rlang::last_error()` to see where the error occurred.
Option 2
example3<-example%>%
group_by(age_cohort,sex,city)%>%
summarise(rich=sum(rich),
middleclass=sum(middleclass),
poor=sum(poor),
population=n(id))
The error:
Error in `summarise()`:
! Problem while computing `population = n(id)`.
i The error occurred in group 1: age_cohort = 1, sex = 0, city = 1.
Caused by error in `n()`:
! unused argument (id)
Run `rlang::last_error()` to see where the error occurred.
Furthermore, if I delete the 'population' variable, I still have problems with my code.
New code
example<-example%>%
group_by(age_cohort,sex,city)%>%
summarise(rich=sum(rich),
middleclass=sum(middleclass),
poor=sum(poor))
The error:
Error in UseMethod("group_by") :
no applicable method for 'group_by' applied to an object of class "function"
The original data (example):
id sex city rich middleclass poor age_cohort
1 0 1 1 0 0 1
2 1 1 0 1 0 5
3 1 2 0 0 1 2
4 0 2 0 0 1 3
5 1 3 0 0 1 4
6 0 4 0 1 0 1
7 0 6 0 1 0 1
8 1 7 1 0 0 5
9 0 3 1 0 0 5
10 1 7 0 1 NA 5
11 1 3 0 0 1 2
12 1 1 0 0 1 3
Solution 1:[1]
As akrun said, you need population=n().
library(dplyr)
id <- 1:12
sex <- c(0,1,1,0,1,0,0,1,0,1,1,1)
city <- c(1,1,2,2,3,4,6,7,3,7,3,1)
rich <- c(1,0,0,0,0,0,0,1,1,0,0,0)
middleclass <- c(0,1,0,0,0,1,1,0,0,1,0,0)
poor <- c(0,0,1,1,1,0,0,0,0,NA,1,1)
age_cohort <- c(1,5,2,3,4,1,1,5,5,5,2,3)
example <- data.frame(id,sex,city,rich,middleclass,poor,age_cohort)
example3 <- example%>%
group_by(age_cohort,sex,city)%>%
summarise(rich=sum(rich),
middleclass=sum(middleclass),
poor=sum(poor),
population=n())
Output
> example
id sex city rich middleclass poor age_cohort
1 1 0 1 1 0 0 1
2 2 1 1 0 1 0 5
3 3 1 2 0 0 1 2
4 4 0 2 0 0 1 3
5 5 1 3 0 0 1 4
6 6 0 4 0 1 0 1
7 7 0 6 0 1 0 1
8 8 1 7 1 0 0 5
9 9 0 3 1 0 0 5
10 10 1 7 0 1 NA 5
11 11 1 3 0 0 1 2
12 12 1 1 0 0 1 3
> example3
# A tibble: 11 x 7
# Groups: age_cohort, sex [7]
age_cohort sex city rich middleclass poor population
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int>
1 1 0 1 1 0 0 1
2 1 0 4 0 1 0 1
3 1 0 6 0 1 0 1
4 2 1 2 0 0 1 1
5 2 1 3 0 0 1 1
6 3 0 2 0 0 1 1
7 3 1 1 0 0 1 1
8 4 1 3 0 0 1 1
9 5 0 3 1 0 0 1
10 5 1 1 0 1 0 1
11 5 1 7 1 1 NA 2
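Note that the last row has poor = NA, because sum() propagates the missing value in row 10. If missing values should be ignored instead, one option is na.rm = TRUE. This is a minimal sketch, assuming dplyr 1.0 or later for the .groups argument; the name example4 is only illustrative.

```r
library(dplyr)

# Same data as above; row 10 has a missing `poor` value.
example <- data.frame(
  id          = 1:12,
  sex         = c(0,1,1,0,1,0,0,1,0,1,1,1),
  city        = c(1,1,2,2,3,4,6,7,3,7,3,1),
  rich        = c(1,0,0,0,0,0,0,1,1,0,0,0),
  middleclass = c(0,1,0,0,0,1,1,0,0,1,0,0),
  poor        = c(0,0,1,1,1,0,0,0,0,NA,1,1),
  age_cohort  = c(1,5,2,3,4,1,1,5,5,5,2,3)
)

# `example4` is a hypothetical name used only for this sketch.
example4 <- example %>%
  group_by(age_cohort, sex, city) %>%
  summarise(
    rich        = sum(rich, na.rm = TRUE),
    middleclass = sum(middleclass, na.rm = TRUE),
    poor        = sum(poor, na.rm = TRUE),  # NA is skipped, not propagated
    population  = n(),                      # rows in the current group
    .groups     = "drop"                    # return an ungrouped tibble
  )
```

Whether dropping the NA is appropriate depends on what the missing value means in your data; keeping it as NA can be the safer choice.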
Why you got errors
Others have already pointed these out in the comments; in summary:
The first error occurred because count() is a data-frame verb: it takes a data frame and column names, for example count(example, sex), and cannot be used as a summary function inside summarise(). You gave it a numeric vector (an object of class "c('double', 'numeric')"), for which it has no method (no applicable method for 'count' applied to...).
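If only the group sizes are needed, count() can be applied to the data frame directly. The sketch below assumes a cut-down version of the data and uses the illustrative names counts and population.

```r
library(dplyr)

# Only the grouping columns are needed to count rows per group.
example <- data.frame(
  sex        = c(0,1,1,0,1,0,0,1,0,1,1,1),
  city       = c(1,1,2,2,3,4,6,7,3,7,3,1),
  age_cohort = c(1,5,2,3,4,1,1,5,5,5,2,3)
)

# count() groups by the given columns and tallies rows in one step;
# `name` sets the name of the count column (default is `n`).
counts <- example %>%
  count(age_cohort, sex, city, name = "population")
```

This is equivalent to group_by() followed by summarise(population = n()), but it cannot compute the other sums in the same call.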
The second error occurred because n() takes no arguments. It returns the number of rows in the current group, which group_by() has already defined (see ?context), so passing id produced unused argument (id).
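If the intent behind n(id) was to count ids rather than rows, n_distinct(id) is one way to express that. A small sketch on the same data, where pop and unique_ids are illustrative names:

```r
library(dplyr)

example <- data.frame(
  id         = 1:12,
  sex        = c(0,1,1,0,1,0,0,1,0,1,1,1),
  city       = c(1,1,2,2,3,4,6,7,3,7,3,1),
  age_cohort = c(1,5,2,3,4,1,1,5,5,5,2,3)
)

pop <- example %>%
  group_by(age_cohort, sex, city) %>%
  summarise(
    population = n(),             # number of rows in the group, no argument
    unique_ids = n_distinct(id),  # number of distinct id values in the group
    .groups    = "drop"
  )
```

The two columns agree here because every id appears once; they would differ if the data contained repeated ids.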
The last error occurred because no object named example existed in your environment when you called group_by(). example is also the name of a function in utils (see ?example), so without a data frame of that name R resolves example to that function. group_by() has methods for data frames, not for functions, hence the message about an object of class "function".
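A quick way to see this, and to guard against it, is to check what the name refers to before piping it into group_by(). This is only a sketch; the three-row data frame is a placeholder.

```r
# With no user-defined `example`, the name resolves to the utils function.
fn_class <- class(utils::example)   # "function"

# After assigning a data frame, the name refers to your data instead.
example <- data.frame(id = 1:3, sex = c(0, 1, 1))

# A cheap guard before the pipeline: stops with an error if `example`
# is not a data frame (for instance, if the assignment was never run).
stopifnot(is.data.frame(example))
```

Choosing a name that does not clash with an existing function, such as a name describing the data, avoids the confusing error message altogether.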
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Valkyr |
