Count first occurrence of a dummy (grouped by) in R and then sum
df <- structure(list(id = c(1L, 1L, 2L, 3L, 3L, 3L, 4L), hire_year = c(2017L,
2017L, 2017L, 2017L, 2016L, 2014L, 2016L), dummy = c(0L, 0L,
1L, 0L, 0L, 0L, 1L)), class = "data.frame", row.names = c(NA,
-7L))
  id hire_year dummy
1  1      2017     0
2  1      2017     0
3  2      2017     1
4  3      2017     0
5  3      2016     0
6  3      2014     0
7  4      2016     1
I would like to count the rows where dummy equals 0, but each id should be counted only once, even if that id has several rows with dummy equal to 0. For the data above I would expect the output to be 2 (ids 1 and 3).
Solution 1:[1]
You can use distinct() from dplyr to keep only the first row for each id, then count the rows where dummy is 0.
library(dplyr)

df %>%
  distinct(id, .keep_all = TRUE) %>%   # keep the first row of each id
  summarise(dummy = sum(dummy == 0))   # count the ids whose kept row has dummy == 0
#   dummy
# 1     2
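If dplyr is not available, the same first-row-per-id logic can be sketched in base R. This is only an illustrative equivalent, not part of the original answer; the names `first_rows` and `n_zero` are made up for the example.

```r
# Sample data from the question
df <- data.frame(
  id = c(1L, 1L, 2L, 3L, 3L, 3L, 4L),
  hire_year = c(2017L, 2017L, 2017L, 2017L, 2016L, 2014L, 2016L),
  dummy = c(0L, 0L, 1L, 0L, 0L, 0L, 1L)
)

# duplicated() flags every repeat of an id, so negating it keeps the first row per id
first_rows <- df[!duplicated(df$id), ]

# Count the ids whose first row has dummy == 0
n_zero <- sum(first_rows$dummy == 0)
print(n_zero)
# [1] 2
```

Like distinct(), this only looks at the first row of each id, so the row order of the data matters.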
Solution 2:[2]
# Count the distinct ids that have at least one row with dummy == 0
length(unique(df$id[df$dummy == 0]))
# [1] 2
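Both solutions return 2 for the sample data, but they may disagree when dummy varies within an id. The sketch below uses a small hypothetical data frame `df2` (not from the question) to show the difference. The first-row count is written in base R with duplicated(), which is assumed here to mirror what distinct(id, .keep_all = TRUE) keeps.

```r
# Hypothetical data (not from the question): id 1 starts with dummy = 1, then has dummy = 0
df2 <- data.frame(id = c(1L, 1L, 2L), dummy = c(1L, 0L, 1L))

# First-row approach (as in Solution 1): only the first row of each id is inspected
first_row_count <- sum(df2$dummy[!duplicated(df2$id)] == 0)

# Any-row approach (as in Solution 2): an id counts if any of its rows has dummy == 0
any_row_count <- length(unique(df2$id[df2$dummy == 0]))

print(first_row_count)
# [1] 0
print(any_row_count)
# [1] 1
```

Which one is appropriate depends on whether "first occurrence" means the first row of each id or any row of that id.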
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Ronak Shah |
| Solution 2 | langtang |
