'Residual standard error in survey package

I am trying to calculate the residual standard error of a linear regression model using the survey package. I am working with a complex design, and the sampling weight of the complex design is given by "weight" in the code below.

fitM1 <- lm(med~x1+x2,data=pop_sample,weight=weight)  
fitM2 <- svyglm(med~x1+x2,data=pop_sample,design=design)

First, if I call "summary(fitM1)", I get the following:

Call: lm(formula=med~x1+x2,data=pop_sample,weights=weight)

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.001787   0.042194   0.042    0.966    
x1           0.382709   0.061574   6.215 1.92e-09 ***
x2           0.958675   0.048483  19.773  < 2e-16 ***

Residual standard error: 9.231 on 272 degrees of freedom
Multiple R-squared:  0.8958,    Adjusted R-squared:  0.8931 
F-statistic: 334.1 on 7 and 272 DF,  p-value: < 2.2e-16

Next, if I call "summary(fitM2)", I get the following:

summary(fitM2)

Call: svyglm(formula=med~x1+x2,data=pop_sample,design=design)

Survey design: svydesign(id=~id_cluster,strat=~id_stratum,weight=weight,data=pop_sample)

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.001787   0.043388   0.041 0.967878    
x1           0.382709   0.074755   5.120 0.000334 ***
x2           0.958675   0.041803  22.933 1.23e-10 ***

When using "lm", I can extract the residual standard error by calling:

fitMvariance <- summary(fitM1)$sigma^2

However, I can't find an analogous function for "svyglm" anywhere in the survey package. The point estimates are the same when comparing the two approaches, but the standard errors of the coefficients (and, presumably, the residual standard error of the model) are different.

r survey

Solution 1:^[1]

Survey Analysis

use the library survey in the r to perform survey analysis, it offers a wide range of functions to calculate the statistics like Percentage, Lower CI, Upper CI, population and RSE.

RSE

we can use thesvyby function in the survey package to get all the statistics including the Root squared error

library("survey")
Survey design: svydesign(id=~id_cluster,strat=~id_stratum,weight=weight,data=pop_sample)
  
svyby(~med, ~x1+x2, design, svytotal, deff=TRUE, verbose=TRUE,vartype=c("se","cv","cvpct","var"))

The cvpct will give the root squared error

Refer for further information svyby

Solution 2:^[2]

Because svyglm is built on glm not lm, the variance estimate is called $dispersion rather than $sigma

> data(api)
> dstrat<-svydesign(id = ~1, strata = ~stype, weights = ~pw, data = apistrat, 
+     fpc = ~fpc)
> model<-svyglm(api00~ell+meals+mobility, design=dstrat)
> summary(model)$dispersion
     variance     SE
[1,]     5172 492.28

This is the estimate of $\sigma^2$, which is the population residual variance. In this example we actually have the whole population, so we can compare

> popmodel<-lm(api00~ell+meals+mobility, data=apipop)
> summary(popmodel)$sigma
[1] 70.58365
> sqrt(5172)
[1] 71.91662

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	Community
Solution 2	Thomas Lumley

'Residual standard error in survey package

Solution 1:[1]

Survey Analysis

RSE

Solution 2:[2]

Sources

Related Questions

Solution 1:^[1]

Solution 2:^[2]