Ignoring NA_real_ when calculating with group_by
I need to do some calculations (mean, max and sum) across rows that share the same ID.
For instance, among all individuals in the same family, which one has the highest education level:
yst = Years of study of this individual
idf = Family identification number for this individual
maxyst = Years of study of the most educated person in this individual's family
I tried to use mutate() and group_by(), but when a group contains even one NA_real_ row, the whole group gets NA. I want to ignore any idf in which ALL rows have is.na(yst). But if an idf has at least one row with !is.na(yst), I want to ignore the NA rows and run the calculation over the valid rows in that idf.
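To make the behavior described above concrete, here is a minimal base R sketch. The vectors `yst_mixed` and `yst_all_na` are hypothetical values invented for illustration; they are not from the original data.

```r
# Hypothetical values, for illustration only
yst_mixed  <- c(8, NA_real_, 12)      # a family with one missing value
yst_all_na <- c(NA_real_, NA_real_)   # a family with no valid values

# A single NA propagates through max() by default
mixed_default <- max(yst_mixed)

# na.rm = TRUE drops the NAs before computing
mixed_narm <- max(yst_mixed, na.rm = TRUE)

# With na.rm = TRUE and nothing left to compare, max() returns -Inf
# and emits a warning, which is why all-NA groups need separate handling
all_na_narm <- suppressWarnings(max(yst_all_na, na.rm = TRUE))
```

Here `mixed_default` is NA, `mixed_narm` is 12, and `all_na_narm` is -Inf. So `na.rm = TRUE` alone covers the mixed case but not the all-NA case.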
This is my current code:
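library(dplyr)
library(magrittr)  # provides the %<>% assignment pipe

df %<>%
  mutate(yst = anosest) %>%
  # workaround: recode missing years of study as 0
  mutate(yst = case_when(
    is.na(anosest) ~ 0,
    TRUE ~ yst)) %>%
  group_by(idf) %>%
  mutate(maxyst = max(yst)) %>%
  ungroup()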
I tried many different approaches combining logical conditions, but I couldn't figure it out.
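One possible approach, offered as a sketch and not as a definitive answer: combine `na.rm = TRUE` with an explicit check for groups that are entirely missing. The helper `na_safe()`, the sample values, and the `meanyst` and `sumyst` column names are hypothetical; only `idf`, `anosest`, `yst` and `maxyst` come from the question.

```r
library(dplyr)

# Hypothetical data: family 1 has one missing value,
# family 2 has no valid values, family 3 is complete
df <- tibble(
  idf     = c(1, 1, 1, 2, 2, 3),
  anosest = c(8, NA, 12, NA, NA, 5)
)

# Hypothetical helper: apply f to the non-missing values,
# or return NA when every value in the group is missing
na_safe <- function(x, f) {
  if (all(is.na(x))) NA_real_ else f(x, na.rm = TRUE)
}

result <- df %>%
  mutate(yst = anosest) %>%
  group_by(idf) %>%
  mutate(
    maxyst  = na_safe(yst, max),
    meanyst = na_safe(yst, mean),
    sumyst  = na_safe(yst, sum)
  ) %>%
  ungroup()
```

In this sketch, rows from an all-NA family are kept with NA in the computed columns. If "ignore" should instead mean dropping those families, adding `filter(!is.na(maxyst))` after `ungroup()` would remove them.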
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
