What hypothesis does `broom::glance.lm()` extract? What is the F-statistic in `summary.lm`?

broom::glance() makes it very easy to compare different models. However, the help file doesn't specify what the statistic or p-value refers to. What hypothesis is being tested?

library(broom)
mod <- lm(mpg ~ wt + qsec, data = mtcars)
glance(mod)
#> # A tibble: 1 x 12
#>   r.squared adj.r.squared sigma statistic  p.value    df logLik   AIC   BIC
#>       <dbl>         <dbl> <dbl>     <dbl>    <dbl> <dbl>  <dbl> <dbl> <dbl>
#> 1     0.826         0.814  2.60      69.0 9.39e-12     2  -74.4  157.  163.
#>   deviance df.residual  nobs
#>      <dbl>       <int> <int>
#> 1     195.          29    32

Created on 2022-02-24 by the reprex package (v2.0.1)



Solution 1:[1]

Looking at the glance.lm() function (see below), it extracts its information from summary.lm(). The F-statistic and its corresponding p-value compare the current model to an intercept-only model, i.e. they test the null hypothesis that all coefficients other than the intercept are zero.
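One way to check this interpretation is to fit the intercept-only model explicitly and run a nested-model F-test with anova(). This is a minimal sketch using only base R; the object names null_mod, cmp and fstat are made up for illustration.

```r
# Model from the question and an intercept-only (null) model
mod      <- lm(mpg ~ wt + qsec, data = mtcars)
null_mod <- lm(mpg ~ 1, data = mtcars)

# Nested-model F-test: H0 is that the coefficients on wt and qsec are both zero
cmp <- anova(null_mod, mod)

# F-statistic stored by summary.lm(): a named vector (value, numdf, dendf)
fstat <- summary(mod)$fstatistic

# The two F values should agree up to floating-point error
same_f <- isTRUE(all.equal(cmp$F[2], unname(fstat["value"])))

# p-value computed the same way glance.lm() does it (upper tail of the F distribution)
p_val  <- pf(fstat["value"], fstat["numdf"], fstat["dendf"], lower.tail = FALSE)
same_p <- isTRUE(all.equal(cmp$`Pr(>F)`[2], unname(p_val)))

print(c(same_f = same_f, same_p = same_p))
```

If both comparisons come back TRUE, the statistic and p.value columns of glance(mod) are the overall F-test of the model against the intercept-only model.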

This becomes clear when comparing glance(mod) to summary(mod): glance(mod) "tidies" up the summary, as motivated by the package (see vignette).

summary(mod)
#> Call:
#> lm(formula = mpg ~ wt + qsec, data = mtcars)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -4.3962 -2.1431 -0.2129  1.4915  5.7486 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  19.7462     5.2521   3.760 0.000765 ***
#> wt           -5.0480     0.4840 -10.430 2.52e-11 ***
#> qsec          0.9292     0.2650   3.506 0.001500 ** 
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 2.596 on 29 degrees of freedom
#> Multiple R-squared:  0.8264, Adjusted R-squared:  0.8144 
#> F-statistic: 69.03 on 2 and 29 DF,  p-value: 9.395e-12
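
The F-statistic in the last line of the summary can also be recomputed by hand from R-squared and the degrees of freedom. The sketch below uses the standard identity F = (R^2 / k) / ((1 - R^2) / residual df); the names k, res_df, f_manual and p_manual are illustrative only.

```r
mod <- lm(mpg ~ wt + qsec, data = mtcars)
s   <- summary(mod)

k      <- length(coef(mod)) - 1  # number of predictors, excluding the intercept
res_df <- df.residual(mod)       # residual degrees of freedom

# Overall F-statistic from R-squared
f_manual <- (s$r.squared / k) / ((1 - s$r.squared) / res_df)

# Upper-tail p-value from the F distribution with (k, res_df) degrees of freedom
p_manual <- pf(f_manual, k, res_df, lower.tail = FALSE)

# Should match the value stored by summary.lm() up to floating-point error
all.equal(f_manual, unname(s$fstatistic["value"]))
```

With k = 2 and 29 residual degrees of freedom, f_manual should reproduce the 69.03 shown in the summary output above.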

The glance.lm() function:

getAnywhere("glance.lm")
A single object matching ‘glance.lm’ was found
It was found in the following places
  registered S3 method for glance from namespace broom
  namespace:broom
with value

function (x, ...) 
{
    warn_on_subclass(x)
    int_only <- nrow(summary(x)$coefficients) == 1
    with(summary(x), tibble(r.squared = r.squared, adj.r.squared = adj.r.squared, 
        sigma = sigma, statistic = if (!int_only) {
            fstatistic["value"]
        }
        else {
            NA_real_
        }, p.value = if (!int_only) {
            pf(fstatistic["value"], fstatistic["numdf"], 
                fstatistic["dendf"], lower.tail = FALSE)
        }
        else {
            NA_real_
        }, df = if (!int_only) {
            fstatistic["numdf"]
        }
        else {
            NA_real_
        }, logLik = as.numeric(stats::logLik(x)), AIC = stats::AIC(x), 
        BIC = stats::BIC(x), deviance = stats::deviance(x), df.residual = df.residual(x), 
        nobs = stats::nobs(x)))
}
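
The int_only branch in the source above exists because summary.lm() does not compute an F-statistic when the model contains nothing but an intercept, so there is nothing to compare against. A small sketch of that case follows; the name null_mod is made up, and the glance() call is guarded in case broom is not installed.

```r
# Intercept-only model: there are no slope coefficients to test
null_mod <- lm(mpg ~ 1, data = mtcars)

# The condition glance.lm() checks: a single row in the coefficient table
int_only <- nrow(summary(null_mod)$coefficients) == 1

# summary.lm() stores no fstatistic element for such a model
no_fstat <- is.null(summary(null_mod)$fstatistic)

print(c(int_only = int_only, no_fstat = no_fstat))

# Given the source shown above, statistic, p.value and df should all be NA here
if (requireNamespace("broom", quietly = TRUE)) {
  print(broom::glance(null_mod)[, c("statistic", "p.value", "df")])
}
```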

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 phargart