'How can I group_by "x or greater" on an integer column and return the "group" integer and other values from that row?
I have a dataset of weightlifting data, and I'm trying to write some simple code using dplyr to find the maximum weight lifted at 1, 2, 3, 4, etc reps.
The problem is this: let's say that the data indicates the lifter lifted 385 for three reps and 365 for two reps. If I simply group the lifts by number of reps, then for the 2-rep max, I will get 365. But of course if you lift 385 three times, you also did it two times.
Here's a reproducible dataset:
# Reproducible data set
Date <- c("2016-07-09", "2016-07-10", "2016-07-11")
Exercise.Name <- c("Squat", "Squat", "Squat")
Weight <- c(405, 365, 385)
Reps <- c(1, 2, 3)
small_data <- data.frame(Date, Exercise.Name, Weight, Reps)
small_data$Date <- as.Date(small_data$Date)
Currently, this is the function I've defined to find all of the X-rep maxes:
find_reps <- function(df, exercise){
df %>% filter(Exercise.Name == exercise) %>%
group_by(Reps) %>% filter(Weight == max(Weight)) %>%
select(Reps, Date, Weight) %>% arrange(Reps, desc(Date))
}
find_reps(small_data, "Squat")
When I run the function I get this output:
Reps Date Weight
<dbl> <date> <dbl>
1 1 2016-07-09 405
2 2 2016-07-10 365
3 3 2016-07-11 385
Original question: How could I rewrite this code to return 385 for both 2 and 3 reps?
Follow up:
- One of the answers to my original question helped me realize that it's important not only to get 385 to return for the 2-rep max in the example above, but that it's associated with the correct session (Date).
- I have realized that what I'm really trying to do is end up with:
A categorical variable for the "Rep Category" (e.g., 1, 2, and 3-rep max)
A different variable for the actual reps lifted (e.g., 1, 3, 3)
The actual date on which the max lift was executed.
So what I'm really trying to end up with is something like this:
Exercise.Name RepCategory MaxWeight ActualReps DateOfPr
1 Squat 1 405 1 2016-07-09
2 Squat 2 385 3 2016-07-11
3 Squat 3 385 3 2016-07-11
Solution 1:[1]
Here I join the input data (filtered for the target exercise) to themselves with a left_join(), allowing combination of data from multiple "rep groups" instead of group_by(Reps) as you did above
library(dplyr)
# Reproducible data set
Date <- c("2016-07-09", "2016-07-10", "2016-07-11")
Exercise.Name <- c("Squat", "Squat", "Squat")
Weight <- c(405, 365, 385)
Reps <- c(1, 2, 3)
small_data <- data.frame(Date, Exercise.Name, Weight, Reps)
small_data$Date <- as.Date(small_data$Date)
find_reps <- function(df, exercise) {
df2 <- df %>%
filter(Exercise.Name == exercise) %>%
mutate(df, .ID = 1)
df2 %>%
left_join(df2, by = ".ID") %>%
group_by(Reps.x) %>%
filter(Reps.y >= Reps.x) %>%
mutate(MaxWeight = max(Weight.y)) %>%
filter(Weight.y == MaxWeight) %>%
select(Exercise.Name.x, Reps.x, MaxWeight, Reps.y, Date.y) %>%
rename(Exercise.Name = Exercise.Name.x, RepCategory = Reps.x, ActualReps = Reps.y, DateOfPr = Date.y)
}
find_reps(small_data, "Squat")
#> # A tibble: 3 × 5
#> # Groups: RepCategory [3]
#> Exercise.Name RepCategory MaxWeight ActualReps DateOfPr
#> <chr> <dbl> <dbl> <dbl> <date>
#> 1 Squat 1 405 1 2016-07-09
#> 2 Squat 2 385 3 2016-07-11
#> 3 Squat 3 385 3 2016-07-11
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Andrew Brown |
