How to find the maximum value within each group and then recode all other values in the group as zero?
I have a data frame with the following simplified structure:
df <- data.frame(Id = c(1,1,1,2,2,2,3,3,3,4,4,4),
                 value = c(500,500,500,250,250,250,300,300,300,400,400,400))
and I am trying to get the following desired output:
df$maxByGroup <- c(500,0,0,250,0,0,300,0,0,400,0,0)
I have tried this:
df$Id <- as.factor(df$Id)
newDf <- df %>%
  group_by(Id) %>%
  summarise(maxByGroup = sum(max(value)))
but I just get a single maximum of 500 returned.
I have looked at other solutions that get the max value easily enough, but I cannot find one that gives the max value and returns 0 for the other values within each group.
The most important aspect of the desired output is that I want to maintain the data structure, with the first observation within each group reflecting the maximum and the rest recoded as zero. Any help would be very much appreciated.
Solution 1:[1]
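You could sort by value in descending order and then, within each group, keep the value in the first row and set the rest to 0. Note that arrange() here also reorders the rows of the whole data frame, as the output shows: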
library(dplyr)
data.frame(Id = c(1,1,1,2,2,2,3,3,3,4,4,4),
           value = c(500,500,500,250,250,250,300,300,300,400,400,400)) %>%
  group_by(Id) %>%
  arrange(desc(value)) %>%
  mutate(
    maxByGroup = if_else(row_number() == 1, value, 0)
  )
#> # A tibble: 12 x 3
#> # Groups:   Id [4]
#>       Id value maxByGroup
#>    <dbl> <dbl>      <dbl>
#>  1     1   500        500
#>  2     1   500          0
#>  3     1   500          0
#>  4     4   400        400
#>  5     4   400          0
#>  6     4   400          0
#>  7     3   300        300
#>  8     3   300          0
#>  9     3   300          0
#> 10     2   250        250
#> 11     2   250          0
#> 12     2   250          0
Created on 2022-01-31 by the reprex package (v2.0.0)
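In the output above the groups come back in the order 1, 4, 3, 2, because the rows were sorted by value. If the original row order has to be kept, one possible variant is to skip the sort and take max(value) directly inside the grouped mutate(). This is only a sketch under the assumption that "first observation" means the first row of each group as it appears in the data; the name result is made up for the example.

```r
library(dplyr)

df <- data.frame(
  Id    = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4),
  value = c(500, 500, 500, 250, 250, 250, 300, 300, 300, 400, 400, 400)
)

# 'result' is a hypothetical name used only in this sketch
result <- df %>%
  group_by(Id) %>%
  # first row of each group gets the group maximum, every other row gets 0
  mutate(maxByGroup = if_else(row_number() == 1, max(value), 0)) %>%
  ungroup()
```

Because nothing is sorted, the rows stay in their original positions, and max(value) still picks the group maximum even when the first row of a group is not the largest one.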
Solution 2:[2]
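library(data.table)
setDT(df)  # convert df to a data.table by reference

# calculate the group maximum and a within-group row id
df[, `:=`(max_value = max(value),
          dummy_row_id = 1:.N),
   by = Id]

# set every row after the first in each group to 0, then drop the helper column
df[dummy_row_id > 1, max_value := 0][, dummy_row_id := NULL]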
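For comparison, the same idea (group maximum on the first row of each group, 0 elsewhere) can also be sketched in base R with ave(), without loading any package. This is an alternative to the answers above rather than something taken from them; pos_in_group and group_max are helper names invented for the example.

```r
df <- data.frame(
  Id    = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4),
  value = c(500, 500, 500, 250, 250, 250, 300, 300, 300, 400, 400, 400)
)

# position of each row within its Id group (1 marks the first row of a group)
pos_in_group <- ave(seq_along(df$Id), df$Id, FUN = seq_along)

# group maximum, repeated on every row of the group
group_max <- ave(df$value, df$Id, FUN = max)

# keep the maximum on the first row of each group, 0 on all other rows
df$maxByGroup <- ifelse(pos_in_group == 1, group_max, 0)
```

Like the data.table version, this leaves the row order untouched and adds only the new column.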
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Sweepy Dodo |
