Need to find which players played in which year(s) using R
I have baseball data and need to find which players:
1. played in 1945 but not 1946.
2. did not play in either year.
3. played in both 1945 and 1946.
For #1, the desired output would be --
player
Albert
Barnes
For #2, the desired output would be --
player
Andrews
David
For #3, the desired output would be --
player
Baker
Frank
I would prefer a dplyr solution but am open to others. I could not find a question on Stack Overflow that matched my situation. If one exists, I would appreciate a link to it.
Sample data (dput output)
structure(list(player = c("Albert", "Andrews", "Baker", "Charles",
"Baker", "David", "Frank", "Barnes", "Ross", "Frank", "Frank"
), year = c(1945, 1944, 1946, 1946, 1945, 1947, 1945, 1945, 1946,
1946, 1947)), class = "data.frame", row.names = c(NA, -11L))
Solution 1:[1]
This should do it:
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
dat <- structure(list(player = c("Albert", "Andrews", "Baker", "Charles",
"Baker", "David", "Frank", "Barnes", "Ross", "Frank", "Frank"
), year = c(1945, 1944, 1946, 1946, 1945, 1947, 1945, 1945, 1946,
1946, 1947)), class = "data.frame", row.names = c(NA, -11L))
# Played in 1945 but not 1946; distinct() guards against repeated rows per player
dat %>%
  group_by(player) %>%
  filter(1945 %in% year & !(1946 %in% year)) %>%
  select(player) %>%
  distinct()
#> # A tibble: 2 × 1
#> # Groups: player [2]
#> player
#> <chr>
#> 1 Albert
#> 2 Barnes
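# Played in neither 1945 nor 1946; distinct() guards against repeated rows per player
dat %>%
  group_by(player) %>%
  filter(!(1945 %in% year) & !(1946 %in% year)) %>%
  select(player) %>%
  distinct()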
#> # A tibble: 2 × 1
#> # Groups: player [2]
#> player
#> <chr>
#> 1 Andrews
#> 2 David
# Played in both 1945 and 1946
dat %>%
  group_by(player) %>%
  filter(all(c(1945, 1946) %in% year)) %>%
  select(player) %>%
  distinct()
#> # A tibble: 2 × 1
#> # Groups: player [2]
#> player
#> <chr>
#> 1 Baker
#> 2 Frank
Created on 2022-05-03 by the reprex package (v2.0.1)
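Since the question is open to non-dplyr approaches, here is a minimal base R sketch of the same three queries using set operations. It is not part of the original answer; the variable names (all_players, p1945, p1946, only_1945, neither, both) are made up for illustration, and the data frame is rebuilt from the sample data in the question.

```r
# Sample data from the question
dat <- data.frame(
  player = c("Albert", "Andrews", "Baker", "Charles", "Baker", "David",
             "Frank", "Barnes", "Ross", "Frank", "Frank"),
  year = c(1945, 1944, 1946, 1946, 1945, 1947, 1945, 1945, 1946, 1946, 1947)
)

# Unique players overall and in each year of interest
# (all names below are illustrative, not from the original answer)
all_players <- unique(dat$player)
p1945 <- unique(dat$player[dat$year == 1945])
p1946 <- unique(dat$player[dat$year == 1946])

# 1. Played in 1945 but not 1946
only_1945 <- setdiff(p1945, p1946)

# 2. Played in neither 1945 nor 1946
neither <- setdiff(all_players, union(p1945, p1946))

# 3. Played in both 1945 and 1946
both <- intersect(p1945, p1946)

print(only_1945)
print(neither)
print(both)
```

Each result is a plain character vector of player names, so duplicates are never an issue; wrap it in data.frame(player = ...) if a one-column table like the desired output is needed.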
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | DaveArmstrong |

