Cobalt love.plot has asymmetric error bars that I cannot remove
I have a quasi-experimental dataset with two groups and two assessments (pre-intervention and post-intervention) that has a little pre-test missingness (from under 1% to 5%, depending on the variable). I am using propensity score matching to make claims about intervention effectiveness. I first used mice to multiply impute 50 complete pre-test datasets. I then used MatchThem to conduct 2:1 propensity score matching. Here's my abbreviated code for the PSM (x stands for a pretest variable, e.g., race):
m.out = matchthem(group ~ x1+ x2.... + x17,
data = df,
approach = "within",
method = "nearest",
ratio = 2)
Everything looks good. I then made a love plot. The love plot looks nice, but both the matched and unmatched data have "error bars." None of the help docs I've seen for cobalt show error bars and there doesn't seem to be a command for removing them. They are asymmetric, which raised a reviewer concern. I get error bars when using love.plot with little customization:
love.plot(m.out, binary = "std")
and when using love.plot with lots of customization:
v <- data.frame(old = c("distance","x1",.... "x17"),
new = c("Propensity score", "Wave",... "Search for purpose"))
love.plot(m.out,
threshold = c(m = .1),
binary = "std",
abs = FALSE,
var.order = "unadjusted",
var.names = v,
limits = c(-0.5, 0.5),
grid = FALSE,
wrap = 100,
sample.names = c("Unmatched", "Matched"),
position = "top",
shapes = c("circle", "triangle"),
drop.distance = TRUE,
disp.sds = FALSE,
colors = c("gray10", "gray50")
)
Anyone know how to remove the error bars or what statistics they are based on?
I'm using RStudio 2021.09.2 Build 382 and cobalt 4.3.2.
Solution 1:[1]
Because you performed matching in multiply imputed data, each imputed dataset will have a different level of balance for each covariate. The line goes from the minimum to the maximum balance statistic for each covariate across imputations, with a point at the mean. This is explained in the documentation for the agg.fun argument to love.plot(), which you can use to control whether ranges or just a point is displayed, and it is described in the vignette on multiply imputed data. The subtitle of the plot also indicates that the range of balance statistics across imputations is being displayed.
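As a minimal sketch of how agg.fun changes the display, assuming m.out is the matchthem() output from the question: "range" is the default and draws the lines, while "mean" shows only the point. The bal.tab() call at the end is an optional extra, not something the question used; it prints the same minimum, mean, and maximum as numbers through the imp.fun argument.

```r
library(cobalt)

# Assumes m.out is the matchthem() result from the question.

# Default behavior: a line from the minimum to the maximum balance
# statistic across imputations, with a point at the mean.
love.plot(m.out, binary = "std", agg.fun = "range")

# Only the mean across imputations: the lines disappear.
love.plot(m.out, binary = "std", agg.fun = "mean")

# Numeric version of what the plot summarizes across imputations.
bal.tab(m.out, binary = "std", imp.fun = c("min", "mean", "max"))
```

The other love.plot() arguments from the question (var.names, limits, shapes, and so on) can be added to either call unchanged.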
I think it's important to see the range because it lets you know whether any of the imputed datasets have balance statistics that may be large, which is true in your case even though on average they are small. While the average matters most, since the effect estimate is the average effect across imputations, it is best in terms of precision for all imputed datasets to have small balance statistics. It can also be helpful to see the variability of the imputed covariate values in the unadjusted sample; for example, there seems to be a fair bit of variation in the Shared humanity variable across imputations, which might indicate that you are not imputing it very well (or that there is a lot of missing data in that variable).
They are asymmetric because the range of balance statistics doesn't have to be symmetrically arranged around the mean balance statistic. They are not error bars indicating confidence interval bounds or anything like that. They are just a line from a minimum to a maximum.
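To make the asymmetry concrete, here is a small base R illustration. The numbers are made up for the example and are not taken from your data: five hypothetical standardized mean differences for one covariate, one per imputation.

```r
# Hypothetical balance statistics for one covariate across 5 imputations
# (illustrative values only, not from the question's data).
smd <- c(0.02, 0.03, 0.04, 0.05, 0.16)

# What the plot summarizes: the line ends at min and max, the point is the mean.
bar <- c(min = min(smd), mean = mean(smd), max = max(smd))
print(bar)

# Distance from the point to each end of the line.
lower_arm <- unname(bar["mean"] - bar["min"])
upper_arm <- unname(bar["max"] - bar["mean"])
print(c(lower = lower_arm, upper = upper_arm))
```

One imputation with a larger imbalance pulls the maximum much further from the mean than the minimum is, so the line is longer on one side of the point.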
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Noah |