Different prediction results for a GLM between sjPlot's get_model_data and predict.glm, especially (but not only) with an averaged model

I fit a negative binomial GLM to test the effect of two categorical variables on count data. I then ran model averaging over all possible models and tried to predict the model's response using two methods (see code below):

  1. using predict
  2. using get_model_data from sjPlot

The fitted values are the same, but the CIs are much wider with predict. I also tried this with a simple (non-averaged) model and got closer results, but they were still different.

library(MASS)   # glm.nb
library(MuMIn)  # dredge, model.avg
library(dplyr)
library(sjPlot)

# A has two levels and B has three levels; dredge() requires na.action = na.fail
m1 <- glm.nb(C ~ A * B, data = Route, na.action = na.fail)
d1 <- dredge(m1)
m.avg1 <- model.avg(d1, fit = TRUE)

newdat <- expand.grid(B = c("B1", "B2", "B3"), A = c("A1", "A2"))

# Method 1: predict() with SEs on the response scale
as.data.frame(predict(m.avg1, newdata = newdat, se.fit = TRUE, type = "response")) %>%
  mutate(CI = 1.96 * se.fit, CI.u = fit + CI, CI.l = fit - CI) %>%
  cbind(newdat)

# Method 2: get_model_data() from sjPlot
# (although I use se = TRUE, I do not get SEs in the output, just a 95% CI)
get_model_data(m.avg1, type = "pred", terms = c("B", "A"), se = TRUE)

Here is a link to the data on Google Sheets.

An example of the results:

  • With predict, for A1*B1 I get predicted fit = 13.24, 95% CI = [10.23, 16.24]
  • With get_model_data, for A1*B1 I get predicted fit = 13.24, 95% CI = [13.01, 13.46]

When I do the same with the original model m1 I get closer results: a difference of only 0.31 in the CI between the two options (for the same example as above). Yet this is still a difference, and it keeps me up at night...
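One possibility I suspect (not confirmed): my predict() interval is built symmetrically on the response scale, while sjPlot/ggeffects typically build the interval on the link scale and then back-transform it, which gives an asymmetric interval on the count scale. A minimal sketch of that link-scale version, reusing m.avg1 and newdat from above and assuming type = "link" is passed through to the component models:

```r
# Sketch: build the 95% CI on the link (log) scale, then exponentiate.
# If this matches the get_model_data() interval, the discrepancy is just
# the scale on which the CI was constructed, not the model itself.
p <- predict(m.avg1, newdata = newdat, se.fit = TRUE, type = "link")
data.frame(newdat,
           fit  = exp(p$fit),
           CI.l = exp(p$fit - 1.96 * p$se.fit),
           CI.u = exp(p$fit + 1.96 * p$se.fit))
```

Note the back-transformed interval is not symmetric around the fit, unlike fit ± 1.96 * se.fit on the response scale.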



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
