How to filter a df where every instance of an element has a value > x in another column (same row)? R
I am trying to filter a df so that it keeps only those subjects for which every row of that subject is above a certain value.
I want a df with only those subjects whose values column is > 0.5 in every instance.
Example df:
df1 <- data.frame(subject = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5),
values = c(.4, .6, .6, .6, .6, .6, .6, .6, .6, .4))
> df1
subject values
1 1 0.4
2 2 0.6
3 3 0.6
4 4 0.6
5 5 0.6
6 1 0.6
7 2 0.6
8 3 0.6
9 4 0.6
10 5 0.4
would produce:
> df1
subject values
1 2 0.6
2 3 0.6
3 4 0.6
4 2 0.6
5 3 0.6
6 4 0.6
Thanks.
Solution 1:[1]
You can use all() inside a grouped filter using dplyr:
library(dplyr)
df1 %>%
group_by(subject) %>%
filter(all(values > .5)) %>%
ungroup()
Output:
# A tibble: 6 x 2
subject values
<dbl> <dbl>
1 2 0.6
2 3 0.6
3 4 0.6
4 2 0.6
5 3 0.6
6 4 0.6
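With a recent dplyr, the same per-group condition can be written without the group_by()/ungroup() pair. This is a minimal sketch, assuming dplyr 1.1.0 or later, where filter() gained the `.by` argument (marked experimental there). The name `kept` is only illustrative.

```r
library(dplyr)

df1 <- data.frame(subject = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5),
                  values = c(.4, .6, .6, .6, .6, .6, .6, .6, .6, .4))

# .by groups only for this one filter() call, so the result is not grouped
kept <- df1 %>%
  filter(all(values > .5), .by = subject)
```

On older dplyr versions `.by` is not available, and the group_by() form above is the one to use.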
Solution 2:[2]
Using min in ave.
df1[with(df1, ave(values, subject, FUN=min)) > .5, ]
# subject values
# 2 2 0.6
# 3 3 0.6
# 4 4 0.6
# 7 2 0.6
# 8 3 0.6
# 9 4 0.6
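To see why this works, the one-liner can be unpacked into steps. ave() returns a vector as long as values, with each element replaced by the minimum of its subject's group. A subject passes only if its smallest value is above the threshold. The names `group_min`, `keep` and `result` are only illustrative.

```r
df1 <- data.frame(subject = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5),
                  values = c(.4, .6, .6, .6, .6, .6, .6, .6, .6, .4))

# per-row copy of each subject's minimum value
group_min <- ave(df1$values, df1$subject, FUN = min)

# TRUE for every row whose subject never drops to 0.5 or below
keep <- group_min > 0.5

# logical row subsetting keeps the original row order and row names
result <- df1[keep, ]
```

Because the test is on the group minimum, it is equivalent to all(values > .5) within each subject.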
Solution 3:[3]
A data.table approach:
library(data.table)
setDT(df1)[, .SD[all(values > 0.5)], by = .(subject)][]
# subject values
# 1: 2 0.6
# 2: 2 0.6
# 3: 3 0.6
# 4: 3 0.6
# 5: 4 0.6
# 6: 4 0.6
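Another common way to write a grouped filter in data.table is to return .SD only when the group passes the test. The sketch below assumes data.table is installed and uses as.data.table() rather than setDT(), so df1 itself is not converted by reference. The names `dt` and `res` are only illustrative.

```r
library(data.table)

df1 <- data.frame(subject = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5),
                  values = c(.4, .6, .6, .6, .6, .6, .6, .6, .6, .4))

# work on a copy so the original data.frame stays a data.frame
dt <- as.data.table(df1)

# groups where the condition is FALSE return NULL and are dropped
res <- dt[, if (all(values > 0.5)) .SD, by = subject]
```

As in the output above, the rows come back ordered by group rather than in their original order.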
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | jay.sf |
| Solution 3 | |
