Plotting outcome estimates from multiple models in the same plot, sorting estimates by group, and tidying model outputs
I would like to plot multiple models in the same plot, with the models sorted/grouped into "categories" on the plot rather than by the independent variable (treatment, a 0-1 dummy), which is the same in all models.
In the solution below, it is unclear where each model coefficient is plotted and whether it is in the correct group. I would like to keep it in the ggplot2 "environment".
rm(list=ls())
# Load packages
library(lfe)
library(dotwhisker)
library(broom)
library(dplyr)
library(stargazer)
set.seed(7)
# Create df. Panel data in long format. One observation equals one county-year
year <- c(2007, 2007, 2007, 2007, 2007, 2008, 2008, 2008, 2008, 2008, 2009, 2009, 2009, 2009, 2009, 2010,
2010, 2010, 2010, 2010, 2011, 2011, 2011, 2011, 2011, 2012, 2012, 2012, 2012, 2012)
county <- c("county1", "county2", "county3", "county4", "county5",
"county1", "county2", "county3", "county4", "county5",
"county1", "county2", "county3", "county4", "county5",
"county1", "county2", "county3", "county4", "county5",
"county1", "county2", "county3", "county4", "county5",
"county1", "county2", "county3", "county4", "county5")
treatment <- sample(c(0,1), replace=TRUE, size = 30)
roads_accounts <- runif(30, -100, 100)
schools_accounts <- runif(30, -100, 100)
elder_accounts <- runif(30, -100, 100)
leisure_accounts <- runif(30, -100, 100)
police_accounts <- runif(30, -100, 100)
administrative_accounts <- runif(30, -100, 100)
libraries_accounts <- runif(30, -100, 100)
roads_budget <- runif(30, -100, 100)
schools_budget <- runif(30, -100, 100)
elder_budget <- runif(30, -100, 100)
leisure_budget <- runif(30, -100, 100)
police_budget <- runif(30, -100, 100)
administrative_budget <- runif(30, -100, 100)
libraries_budget <- runif(30, -100, 100)
D <- data.frame(year, county, treatment, roads_accounts, schools_accounts, elder_accounts, leisure_accounts,
police_accounts, administrative_accounts, libraries_accounts, roads_budget, schools_budget,
elder_budget, leisure_budget, police_budget, administrative_budget, libraries_budget)
# Estimate models, 14 models in total
model.list = vector(mode = "list", length = 14)
j = 1
for (i in c("roads_accounts", "schools_accounts", "elder_accounts", "leisure_accounts",
"police_accounts", "administrative_accounts", "libraries_accounts", "roads_budget",
"schools_budget", "elder_budget", "leisure_budget", "police_budget", "administrative_budget",
"libraries_budget"))
{
temp.dta = data.frame(y = D[, i], D[, (!colnames(D) %in% c("roads_accounts", "schools_accounts", "elder_accounts", "leisure_accounts",
"police_accounts", "administrative_accounts", "libraries_accounts", "roads_budget",
"schools_budget", "elder_budget", "leisure_budget", "police_budget", "administrative_budget",
"libraries_budget"))])
model.list[[j]] <- felm(y ~ treatment | factor(county) + factor(year) | 0 | county, data = temp.dta)
j = j + 1
}
# Plot models
p <- dwplot(model.list)
category <- rep(c("Account numbers", "Budget numbers"), 7)
groups <- rep(c("Roads", "Schools", "Elderly", "Leisure", "Police", "Administrative", "Libraries"), 2)
p$layers <- lapply(p$layers, function(x) {
x$data$model <- category
x$data$term <- groups
x})
p + scale_color_manual(values = c("red4", "blue4")) +
geom_vline(xintercept = 0, linetype = 2) +
theme_minimal()
My output:
I do not understand the order in which the groups appear. I specified it as "Roads", "Schools", "Elderly", "Leisure", "Police" etc.; however, the groups appear in a different order on the graph. So it is unclear to me whether each group represents the correct models (i.e. account and budget numbers for e.g. Roads). I'm pretty sure it is not correct at the moment.
In addition to a figure, I am trying to figure out how to extract information from the models in a tidy output like a dataframe, where I can do some statistics across the models, e.g. finding the median coefficient etc, and create a stargazer table with basic statistics from all the models. I have tried this so far:
# tidy(model.list) # Does not work since it is a list, so:
lapply(model.list, tidy, conf.int = TRUE) # this works. Now I need to save it somehow and get the statistics that I need.
## create a df and/or stargazer table with the tidy model output. Including the following as columns: outcome; coefficient; SEs; P-values; CIs and number of observations, N:
df_models <- model.list %>%
lapply(model.list, tidy, conf.int = TRUE) ## Does not work
stargazer(model.list, type = "text") ## Works but wrong output
Solution 1:[1]
dwplot produces a ggplot, and the problem with convenience functions like this is that you gain ease-of-use but lose the ability to fully customize your output.
If dwplot does not lay out your ggplot the way you want, your options are to either:
1. Respecify your models.
2. Extract the model coefficients from the model and build the plot manually.
3. Try to coerce dwplot to plot your models the way you want.
4. Change the ggplot object to conform to your requirements.
I assume you don't want to re-specify your models, and from looking at the docs, it's probably not possible to get dwplot to change its output to the one you want. Of the two remaining options, the one that requires least code is probably number 4; that is, change the ggplot object.
First store the dwplot output:
p <- dwplot(model.list)
Now define your "colour groups" and your "y axis groups" as vectors in the order you want them to appear on the plot.
colors <- rep(c("1", "2"), 3)
groups <- rep(c("A", "B", "C"), 2)
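These vectors need one entry per model, in the same order as the models in the list. As a rough sketch for the 14 models in the question (the outcome names come from the question's loop; `areas` and `labels` are hypothetical helper names), note that `rep(..., each = 7)` yields seven account labels followed by seven budget labels, which matches the loop order, whereas `rep(..., 7)` alternates the two labels:

```r
# Outcome names in the order the question's loop estimates them:
# seven "_accounts" models first, then seven "_budget" models.
areas <- c("roads", "schools", "elder", "leisure",
           "police", "administrative", "libraries")
outcomes <- c(paste0(areas, "_accounts"), paste0(areas, "_budget"))

# One label per model, aligned with the model order above.
labels <- data.frame(
  outcome  = outcomes,
  category = rep(c("Account numbers", "Budget numbers"), each = 7),
  group    = rep(c("Roads", "Schools", "Elderly", "Leisure",
                   "Police", "Administrative", "Libraries"), times = 2)
)

print(labels)
```

Printing the helper data frame makes it easy to check by eye that each outcome is paired with the intended category and group before anything is written into the plot.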
The tricky bit comes in writing these into the plot layers:
p$layers <- lapply(p$layers, function(x) {
x$data$model <- colors
x$data$term <- groups
x})
Now we can plot our result, and add a colour scheme, a theme, and a vertical line at x = 0, as in the desired plot.
p + scale_color_manual(values = c("red4", "blue4")) +
geom_vline(xintercept = 0, linetype = 2) +
theme_classic()
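For comparison, option 2 (extracting the coefficients and building the plot manually) might look roughly like the sketch below. It is only an illustration under some assumptions: the data are a simplified stand-in for the question's data frame, `lm()` with county and year dummies stands in for `felm()` (so the standard errors are not clustered), and `coefs`, `areas` and `outcomes` are hypothetical names. The tidy data frame it builds also covers the second part of the question, since statistics such as the median coefficient can be computed from it directly.

```r
library(broom)    # tidy()
library(dplyr)    # bind_rows(), filter(), mutate()
library(ggplot2)

set.seed(7)
# Stand-in panel: 5 counties x 6 years, one row per county-year
D <- expand.grid(county = paste0("county", 1:5), year = 2007:2012)
D$treatment <- sample(c(0, 1), nrow(D), replace = TRUE)

areas <- c("roads", "schools", "elder", "leisure",
           "police", "administrative", "libraries")
outcomes <- c(paste0(areas, "_accounts"), paste0(areas, "_budget"))
for (v in outcomes) D[[v]] <- runif(nrow(D), -100, 100)

# One model per outcome, stored in a list named by outcome.
# lm() is an assumption here; the question uses felm().
model.list <- lapply(outcomes, function(v) {
  lm(reformulate(c("treatment", "factor(county)", "factor(year)"),
                 response = v), data = D)
})
names(model.list) <- outcomes

# Stack the tidy output; keep only the treatment coefficient of each model
coefs <- lapply(model.list, tidy, conf.int = TRUE) %>%
  bind_rows(.id = "outcome") %>%
  filter(term == "treatment") %>%
  mutate(
    n        = vapply(model.list[outcome], nobs, numeric(1), USE.NAMES = FALSE),
    category = ifelse(grepl("_accounts$", outcome),
                      "Account numbers", "Budget numbers"),
    # Factor levels fix the axis order; reversed so "roads" ends up on top
    group    = factor(sub("_(accounts|budget)$", "", outcome),
                      levels = rev(areas))
  )

median(coefs$estimate)  # summary statistic across the 14 models

p <- ggplot(coefs, aes(x = group, y = estimate, colour = category)) +
  geom_hline(yintercept = 0, linetype = 2) +
  geom_pointrange(aes(ymin = conf.low, ymax = conf.high),
                  position = position_dodge(width = 0.5)) +
  scale_colour_manual(values = c("red4", "blue4")) +
  coord_flip() +
  theme_minimal()
p
```

Because the category and group labels are derived from the outcome name in the same row as the estimate, there is no need to rely on the row order of the plot layers. For `felm` objects the observation count may need a different accessor than `nobs()`.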
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Allan Cameron |


