How to create a new variable with the imputed MICE data
I have imputed my data with mice, but now I need to create an index from the imputed variables. I want this new variable to be present in every imputed version of the data so that I can pool the regression results afterwards. I'll be using the dataset (nhanes) included with mice to explain my problem.
library(mice)
data_imp <- mice(nhanes,
m = 5,
maxit = 10,
seed = 23109)
This data set has 4 variables (age, bmi, hyp, chl). Imagine I would like to add a new variable, bmi_chl, holding the mean of bmi and chl, to each of the 5 imputed data sets.
I want to use a for loop, so that the operation is applied to all 5 of them.
for (i in 1:length(data_imp)) {data_imp$imp[[i]]$bmi_chl <-
rowMeans(data_imp$imp[[i]][,c("bmi","chl")], na.rm = T)
}
First, the code doesn't work. But I've also noticed another problem in the way the data is stored: the original data is in one item, and the imputed values are in another. However, the imputed list does not contain the original data, only the imputed entries. How can I get my new variable into all 5 imputed data sets with all observations?
Solution 1:[1]
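complete(data_imp, "all") returns a list of the completed data frames, each holding the observed values together with the imputed ones. Once you have extracted them, you can use lapply to add the new column to every data frame: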
library(mice)
data_imp <- mice(nhanes,
m = 5,
maxit = 10,
seed = 23109)
imputed.dfs <- complete(data_imp, "all")
imputed.dfs <- lapply(imputed.dfs, function(x) data.frame(x, bmi_chl=rowMeans(x[,c("bmi", "chl")])))
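Since the goal is to pool regression results afterwards, here is a minimal sketch of two ways that could be done. The model formula (hyp ~ age + bmi_chl) is only an illustrative assumption and is not taken from the question. The first route fits the model on each data frame in the list and wraps the fits with as.mira() before pooling. The second route goes through the long format and as.mids(), so the new variable ends up inside a mids object that with() can use directly.

```r
library(mice)

data_imp <- mice(nhanes, m = 5, maxit = 10, seed = 23109, printFlag = FALSE)

# Route 1: work on the list of completed data frames.
imputed.dfs <- complete(data_imp, "all")
imputed.dfs <- lapply(imputed.dfs, function(x) {
  x$bmi_chl <- rowMeans(x[, c("bmi", "chl")])
  x
})

# Fit the (illustrative) model on each completed data set, then pool.
fits <- lapply(imputed.dfs, function(d) lm(hyp ~ age + bmi_chl, data = d))
pooled_list <- pool(as.mira(fits))

# Route 2: stack original + imputed data, add the column, rebuild a mids object.
# include = TRUE keeps the original incomplete data (.imp == 0), which as.mids() needs.
long <- complete(data_imp, action = "long", include = TRUE)
long$bmi_chl <- rowMeans(long[, c("bmi", "chl")])  # NA where bmi or chl is missing in the original rows
data_imp2 <- as.mids(long)

pooled_mids <- pool(with(data_imp2, lm(hyp ~ age + bmi_chl)))

summary(pooled_list)
summary(pooled_mids)
```

The second route is convenient when later steps expect a mids object; the first is enough if you only need the pooled estimates.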
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
