Histogram of MICE multiple imputed variable in R
After using the mice package to impute missing data, I am looking for a way to plot the distribution of one of the imputed variables as a histogram. The following code plots the distribution of "Ozone", but it produces one histogram per imputed dataset (5 in total).
Is it possible to create a single histogram that is the "pooled" result of the 5 histograms, similar to how you would pool regression coefficients from mice-imputed datasets to get a final summary of the coefficients?
# Example dataset
data <- airquality
# Add missing data
data[4:10,3] <- rep(NA,7)
data[1:5,4] <- NA
data <- data[-c(5,6)]
# Impute missing data creating 5 datasets
imp <- mice::mice(data,m=5,maxit=50,meth='pmm',seed=500)
# Plot distribution of "Ozone" - Results in 5 plots, Aim is one "pooled" histogram
with(imp, hist(Ozone))
Solution 1:[1]
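You can use the merge_imputations() function from the sjmisc package to merge the imputations into a single data frame, which you can then plot as one histogram. For numeric variables such as Ozone, the merged value is the mean of the imputed values across the 5 datasets. You can use the following code: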
# Example dataset
data <- airquality
# Add missing data
data[4:10,3] <- rep(NA,7)
data[1:5,4] <- NA
data <- data[-c(5,6)]
# Impute missing data creating 5 datasets
imp <- mice::mice(data,m=5,maxit=50,meth='pmm',seed=500)
# Merge the 5 imputations into one data frame (merge_imputations() comes from the sjmisc package)
merged_imp <- sjmisc::merge_imputations(data, imp)
# Plot distribution of "Ozone" - a single histogram of the merged values
with(merged_imp, hist(Ozone))
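As an alternative to averaging the imputed values, one could stack the 5 completed datasets and draw a single histogram over all of them. The following is only a sketch of that idea, not part of the original answer: it uses mice::complete() with action = "long", and the object name stacked and the plot labels are made up for illustration. The histogram is drawn on the density scale because each observed value appears 5 times in the stacked data, which would inflate raw counts.

```r
library(mice)

# Same example data as in the question
data <- airquality
data[4:10, 3] <- NA
data[1:5, 4] <- NA
data <- data[-c(5, 6)]

# Impute missing data creating 5 datasets
imp <- mice(data, m = 5, maxit = 50, method = "pmm", seed = 500,
            printFlag = FALSE)

# Stack the 5 completed datasets row-wise; the ".imp" column
# records which imputation each row came from
stacked <- mice::complete(imp, action = "long")

# One histogram over all stacked Ozone values (density scale,
# since every observed value is repeated once per imputation)
hist(stacked$Ozone, freq = FALSE,
     main = "Ozone, stacked over 5 imputations", xlab = "Ozone")
```

Unlike the merged data frame, the stacked version keeps the between-imputation spread of the imputed values visible in the plot, at the cost of no longer having one value per observation.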
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Quinten |

