Add predictors one by one in random forest
I need help with a syntax problem. I have a dataset that looks like this:
y <- rnorm(1000)
x1 <- rnorm(1000) + 0.2 * y
x2 <- rnorm(1000) + 0.2 * x1 + 0.1 * y
x3 <- rnorm(1000) - 0.1 * x1 + 0.3 * x2 - 0.3 * y
data <- data.frame(y, x1, x2, x3)
head(data)
I need a loop that fits a random forest with one predictor, then adds the remaining predictors one at a time, like this:
randomForest(y ~ x1, data = data)
randomForest(y ~ x1 + x2, data = data)
randomForest(y ~ x1 + x2 + x3, data = data)  # and so on
Would you be kind enough to help me? Thank you in advance!
Solution 1:[1]
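You can build the formula as a string and convert it with as.formula():
library(randomForest)
# Fits y ~ x1, y ~ x1 + x2, y ~ x1 + x2 + x3; the \(i) shorthand needs R >= 4.1
lapply(1:3, \(i) {
  f <- as.formula(paste0("y ~ ", paste0("x", 1:i, collapse = " + ")))
  randomForest(f, data = data)
})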
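As an alternative to pasting strings, base R's reformulate() builds the same formulas from a character vector of term labels. A minimal sketch, assuming the predictors are named x1 to x3 as in the question; the name `formulas` is only illustrative:

```r
# Build y ~ x1, y ~ x1 + x2, y ~ x1 + x2 + x3 without pasting a formula string.
# Assumes the predictors are named x1..x3, as in the question.
formulas <- lapply(1:3, function(i) {
  reformulate(paste0("x", seq_len(i)), response = "y")
})
formulas
```

Each element can be passed to randomForest() in place of the as.formula() result, for example randomForest(formulas[[2]], data = data).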
A more general approach, for example when the predictors are not consistently named or you do not want to hard-code how many there are, is to collect the predictor names in a character vector (say, from colnames()) and adjust the loop slightly:
# Every column except the response y, in column order
predictors <- setdiff(colnames(data), "y")
lapply(seq_along(predictors), \(i) {
  f <- as.formula(paste0("y ~ ", paste0(predictors[1:i], collapse = " + ")))
  randomForest(f, data = data)
})
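If the goal is to compare the nested models, the fitted forests can be kept in a named list and summarised afterwards. The sketch below is one possible way to do that, not part of the original answer: the seed, ntree = 100, and the names `fits` and `oob_mse` are arbitrary choices made for illustration, and it assumes the randomForest package is installed.

```r
library(randomForest)

set.seed(42)  # arbitrary seed, only so the sketch is reproducible
y  <- rnorm(1000)
x1 <- rnorm(1000) + 0.2 * y
x2 <- rnorm(1000) + 0.2 * x1 + 0.1 * y
x3 <- rnorm(1000) - 0.1 * x1 + 0.3 * x2 - 0.3 * y
data <- data.frame(y, x1, x2, x3)

predictors <- setdiff(colnames(data), "y")

# Right-hand sides "x1", "x1 + x2", "x1 + x2 + x3", used as model labels too.
rhs <- vapply(seq_along(predictors), function(i) {
  paste(predictors[seq_len(i)], collapse = " + ")
}, character(1))

# One forest per cumulative predictor set; ntree = 100 keeps the sketch fast.
fits <- lapply(rhs, function(terms) {
  randomForest(as.formula(paste("y ~", terms)), data = data, ntree = 100)
})
names(fits) <- rhs

# For regression forests, $mse holds the out-of-bag MSE after each tree;
# the last element is the error of the full forest.
oob_mse <- vapply(fits, function(fit) tail(fit$mse, 1), numeric(1))
oob_mse
```

The result is a named numeric vector with one out-of-bag error per model, which makes it easy to see whether each added predictor helps.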
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
