Custom function applied to groups individually in R

Imagine a function that sums and then divides when specific conditions are met in various columns. Run on the whole data frame, it works nicely.

sum(df[which(df$x %in% c("1", "2", "3") | df$y %in% c("1", "2", "3")), "z"])/sum(df$z)

Now imagine grouping by another column and trying to get that function to work within those groups. In my case, when I try this, it doesn't work!

df %>% group_by(a) %>% sum(df[which(df$x %in% c("1", "2", "3") | df$y %in% c("1", "2", "3")), 4])/sum(df$z)

The answer I am getting is the calculation across the entire df, repeated for every value of a.

What I need is the answer within each of a's groups.

I don't know exactly how to ask this, but is there some way to get the first function to run separately for each of the grouping values in column a?

Thank you

r


Solution 1:[1]

After grouping by 'a', df$ still extracts the entire column and ignores the groups. Instead, refer to the columns by their bare names (i.e. remove the df$) so that they are evaluated within each group, and use summarise to do the computation

library(dplyr)
df %>%
   group_by(a) %>%
   summarise(Prop = sum(z[x %in% 1:3|y %in% 1:3])/sum(z))
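
To see the difference on concrete numbers, here is a minimal sketch with a made-up data frame. The values below are assumptions chosen for illustration, not the asker's data.

```r
library(dplyr)

# Hypothetical toy data: two groups in column 'a'
df <- data.frame(
  a = c("g1", "g1", "g1", "g2", "g2"),
  x = c(1, 5, 2, 7, 8),
  y = c(9, 9, 9, 1, 9),
  z = c(10, 20, 30, 40, 60)
)

# Whole-data-frame proportion: df$ always sees every row,
# so this value would be the same inside any group
overall <- sum(df$z[df$x %in% 1:3 | df$y %in% 1:3]) / sum(df$z)

# Per-group proportion: bare column names are evaluated within each group
by_group <- df %>%
  group_by(a) %>%
  summarise(Prop = sum(z[x %in% 1:3 | y %in% 1:3]) / sum(z))

print(overall)
print(by_group)
```

With this toy data, overall is 80/160 = 0.5, while by_group holds 40/60 for g1 and 40/100 for g2.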

If there are many columns to check, then use if_any

df %>%
   group_by(a) %>%
   summarise(Prop = sum(z[if_any(c(x, y),  ~.x %in% 1:3)])/sum(z))
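
Since the question asks about a custom function, the same calculation can also be wrapped in a function that takes plain vectors and is called inside summarise. This is only a sketch: the name prop_matching, its keys argument, and the toy data are hypothetical and do not come from the original post.

```r
library(dplyr)

# Hypothetical helper: share of z contributed by rows where x or y is in keys.
# It takes vectors rather than a data frame, so inside summarise() it
# receives only the rows of the current group.
prop_matching <- function(z, x, y, keys = 1:3) {
  sum(z[x %in% keys | y %in% keys]) / sum(z)
}

# Hypothetical toy data: two groups in column 'a'
df <- data.frame(
  a = c("g1", "g1", "g1", "g2", "g2"),
  x = c(1, 5, 2, 7, 8),
  y = c(9, 9, 9, 1, 9),
  z = c(10, 20, 30, 40, 60)
)

res <- df %>%
  group_by(a) %>%
  summarise(Prop = prop_matching(z, x, y))

print(res)
```

Because the helper never touches df directly, it behaves the same on the whole data frame, e.g. prop_matching(df$z, df$x, df$y), and within each group.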

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1