How to filter variables by fold change difference in R

I'm trying to filter a very heterogeneous dataset.

I have numerous variables, each with several replicates. I have a factor with two levels (let's say X and Y), and I would like to keep only the variables whose mean shows at least a two-fold change between the levels (X/Y >= 2 OR Y/X >= 2).

How can I achieve that in R? I can think of some ways, but they seem like too much of a hassle, and I'm sure there is a better way. I will later run multivariate tests on the filtered variables.

This would be an example dataset:

d <- read.table(text = "a b c d factor replicate
1 2 2 3 X      1
3 2 4 4 X      2
2 3 1 2 X      3
1 2 3 2 X      4
5 2 6 4 Y      1
7 4 5 5 Y      2
8 5 7 4 Y      3
6 4 3 3 Y      4", header = TRUE)

From this example, only variables a and c should be kept.



Solution 1:[1]

Using colMeans:

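# split the measurement columns (a to d) by factor level
x <- d[ d$factor == "X", 1:4 ]
y <- d[ d$factor == "Y", 1:4 ]

# x/y divides element-wise, pairing rows by position (replicate 1 with 1, ...);
# colMeans then averages those ratios, and which() returns the columns to keep
which(colMeans(x/y) >= 2 | colMeans(y/x) >= 2)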
# a c 
# 1 3 
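
Note that Solution 1 averages the per-replicate ratios, which relies on the X and Y rows being in the same replicate order. The question asks for the fold change of the means, so a variant that divides the group means directly may be closer to what was intended. The following is only a sketch under that assumption; the names `vars`, `mx`, `my`, `fc`, `keep` and `filtered` are illustrative and not part of the original answer. On the example data both approaches select the same variables.

```r
d <- read.table(text = "a b c d factor replicate
1 2 2 3 X      1
3 2 4 4 X      2
2 3 1 2 X      3
1 2 3 2 X      4
5 2 6 4 Y      1
7 4 5 5 Y      2
8 5 7 4 Y      3
6 4 3 3 Y      4", header = TRUE)

# measurement columns (assumed: everything except factor and replicate)
vars <- setdiff(names(d), c("factor", "replicate"))

# mean of each variable within each factor level
mx <- colMeans(d[d$factor == "X", vars])
my <- colMeans(d[d$factor == "Y", vars])

# fold change of the means, checked in both directions
fc <- mx / my
keep <- names(fc)[fc >= 2 | 1 / fc >= 2]

# keep the selected variables plus the design columns for later tests
filtered <- d[, c(keep, "factor", "replicate")]
```

Here `keep` holds the variable names rather than column positions, so the subset still works if the column order changes.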

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 zx8754