'Check if a value among a list exist in multiple column R data.table
Problem
I have a dataset with a lot of identical kind of variables which can contain the same values. I would like to check if among those variable, we can find at list one value among a list of values.
Example
Let say that we have a dataset with 3 variables of factor type DAS1
, DAS2
, DAS3
.
The possible values for those variables are c("0", "1", "x", "y")
(note that I'm not trying to differentiate numbers and letters. Consider every values as characters).
library(data.table)
start <- data.table::data.table(DAS1 = c("0","1","x","0","1","0","1"),
DAS2 = c("x","y","0","0","x","1","0"),
DAS3 = c("1","1","y","1","x","y","0"))
My objective is to find which row contain at least one observation of the values "x"
or "y"
.
result <- data.table::data.table(DAS1 = c("0","1","x","0","1","0","1"),
DAS2 = c("x","y","0","0","x","1","0"),
DAS3 = c("1","1","y","1","x","y","0"),
xy = c(T,T,T,F,T,T,F))
Condition
I really want to do it with the data.table
package and not dplyr
because I mostly use data.table
and I don't like to switch between the two packages if it's not necessary.
Answer by @lovalery
start[, xy := apply(start[,c("DAS1", "DAS2", "DAS3")],1, function(x) any(x %in% c("x", "y")))][]
Solution 1:[1]
You can try this
> start[, xy := rowSums((.SD == "x") + (.SD == "y")) > 0][]
DAS1 DAS2 DAS3 xy
1: 0 x 1 TRUE
2: 1 y 1 TRUE
3: x 0 y TRUE
4: 0 0 1 FALSE
5: 1 x x TRUE
6: 0 1 y TRUE
7: 1 0 0 FALSE
or
> start[, xy := rowSums(Reduce(`+`, Map(`==`, c("x", "y"), list(.SD)))) > 0][]
DAS1 DAS2 DAS3 xy
1: 0 x 1 TRUE
2: 1 y 1 TRUE
3: x 0 y TRUE
4: 0 0 1 FALSE
5: 1 x x TRUE
6: 0 1 y TRUE
7: 1 0 0 FALSE
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |