Summarize count with a condition

I have a data frame of quadrat counts for a number of species across a number of years. Some entries are recorded as "p" (for "present") instead of a count. I want to average the counts while treating those p's as NA, and also keep track of how many p's there are for each species/year. So my question is: is there a way to count the occurrences of "p" inside summarize()?

Minimal example:

library(dplyr)

df <- data.frame(
  # years
  year = rep(1990:1992, each = 3),
  # character vector of counts and p's
  count = c("p", "p", "2", "1", "5", "4", "7", "p", "4")
) %>%
  # numeric column of counts, with NA where the p's are
  # (as.numeric() warns that NAs were introduced by coercion)
  mutate(count_numeric = as.numeric(count))


# summarize dataset
df %>%
  group_by(year) %>%
  summarize(number_quadrats = n(), # find total number of rows
            average_count = mean(count_numeric, na.rm = TRUE)) # find average value, ignoring NAs

But I want to add another line to the summarize() call that just counts the number of p's in each group. Something like this:

df %>%
  group_by(year) %>%
  summarize(number_quadrats = n(), # find total number of rows
            average_count = mean(count_numeric, na.rm=T),# find average value
            number_p = n(count == "p"))

But that doesn't work.

Any advice appreciated.

Thanks!



Solution 1:[1]

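Use sum() on the condition: count == "p" is TRUE or FALSE for each row, and sum() counts the TRUEs. Something like this: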

df %>%
  group_by(year) %>%
  summarize(number_quadrats = n(),            # total number of rows
            number_p = sum(count == "p"),     # number of p's in each group
            average_count = mean(count_numeric, na.rm = TRUE))

   year number_quadrats number_p average_count
  <int>           <int>    <int>         <dbl>
1  1990               3        2          2   
2  1991               3        0          3.33
3  1992               3        1          5.5 
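
The counting in the answer above works because comparing a vector to "p" gives a logical vector, and sum() treats TRUE as 1 and FALSE as 0. As a possible variation (not part of the original answer), the sketch below skips the helper column and converts the p's to NA inside summarize() using dplyr's na_if(), which should also avoid the coercion warning from as.numeric(). The column names number_quadrats and number_p are taken from the question; the object name result is just a placeholder.

```r
library(dplyr)

# Same toy data as in the question, without the count_numeric helper column
df <- data.frame(
  year  = rep(1990:1992, each = 3),
  count = c("p", "p", "2", "1", "5", "4", "7", "p", "4")
)

result <- df %>%
  group_by(year) %>%
  summarize(
    number_quadrats = n(),                 # total rows per year
    number_p        = sum(count == "p"),   # TRUE counts as 1, so this counts the p's
    # na_if() turns "p" into NA before the numeric conversion,
    # so mean() only sees the real counts
    average_count   = mean(as.numeric(na_if(count, "p")), na.rm = TRUE)
  )

print(result)
```

If the real data also has a species column, grouping with group_by(species, year) instead of group_by(year) should give one row per species/year combination.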

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 TarJae