'How can I calculate F statistics for the Gamma regression in R manually?

I'm trying to understand how anova calculates F-value for a Gamma glm. I have some weird skewed data:

y <- c(0, 0.88, 0.94, 0, 0.95, 0.77, 3.22, 3.52, 1.22, 1.52, 1.23, 
        0.92, 1.11, 1.18, 1.47, 1.53, 0, 0, 1.09, 0.83, 0.8, 1.56, 6, 
        0.74, 1.18, 1.01, 0.82, 3.83, 1.75, 1.27, 1.54, 1.05, 1.08, 0.9, 
        0.77, 1.44, 4.55, 0, 1.44, 2.91, 0.71, 12.93, 0.77, 0, 1.14, 
        1.06, 3.96, 1.57, 1.63)
x <- c(6.9469287, 6.290469147, 6.1918829, 6.104770097, 5.939523496, 
      5.942857082, 6.163662277, 6.399218779, 5.783065061, 5.638420345, 
      5.552741687, 5.683432022, 5.857426116, 6.162680044, 5.957396843, 
      6.571818964, 5.446848271, 5.712962062, 5.653265224, 6.349141363, 
      5.46503105, 6.049651518, 7.380125424, 5.722479551, 5.950585693, 
      5.808206582, 6.096318404, 5.913429847, 5.997807119, 6.206943676, 
      6.550982371, 6.543636484, 6.822385253, 6.507588297, 5.940914702, 
      6.439753879, 6.899586949, 6.156580921, 7.116019293, 6.355315455, 
      6.538796291, 6.498027706, 6.196593891, 6.339028678, 6.23909998, 
      6.551869452, 6.688031206, 6.492259138, 5.997315277)
 y <- y + 0.001

I added 0.001 to y to avoid zeros. For a simple regression I could reproduce the F test run by anova:

lm0 <- lm(y ~ 1)
lm1 <- lm(y ~ x)
#
y.p <- lm1$fitted.values # predicted/fitted values
SSE <- sum((y - y.p)^2)
SSR <- sum((y.p - mean(y))^2)
SST <- sum((y - mean(y))^2)
round(SST - (SSE + SSR), 4) #check
# [1] 0
#
SS1 <- sum(residuals(lm0, "deviance")^2) #=SST
SS2 <- sum(residuals(lm1, "deviance")^2) #=SSE
df1 <- lm0$df.residual
df2 <- lm1$df.residual
MSE <- SS2/df2 
MSR <- ((SS1 - SS2)/(df1 - df2))
MSR/MSE # F-value
# [1] 5.927608
anova(lm0, lm1, test="F")$F[2]
# [1] 5.927608

However, I could not reproduce F for a Gamma-version of the regression:

lm0 <- glm(y ~ 1, family=Gamma(link="log"))
lm1 <- glm(y ~ x, family=Gamma(link="log"))
#
oo <- Gamma(link="log") # family info
y.p <- oo$mu.eta(eta) # fitted values on the original scale
# ... the same as for lm example above
MSR/MSE # F-value
# [1] 3.862559
anova(lm0, lm1, test="F")$F[2]
# [1] 7.356901

However if I take the MSR, which is obviously called "Deviance" in the anova-output and divide it by the Dispersion parameter from the model summary (which strangely for me can be produced from working residuals) I'll get the correct F:

# correct F for the gamma regression:
disp <- summary(lm1)$dispersion
mdisp <- sum(residuals(lm1, "working")^2)/df2 # MSE-variant with working residuals
disp - mdisp # check
# [1] 0
Dev <- anova(lm1)$Deviance[2]
MSR-Dev # check
# [1] 0
MSR/mdisp # correct F as in anova
# [1] 7.356901

For me (w/o a deeper mathematical education) these last manipulations - which I found through try and error - look like magic. Could somebody help me to understand how the link-function of the Gamma glm is "interwoven" in the MSR/MSE calculation? I need the understanding to be able to calculate F for a Gamma regression performed with the R fastglm package, which is not compatible with anova.



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source