Unbalanced panel error in PMG Analysis in R

I am trying to run a Fama-MacBeth analysis in R, using the pmg function with the following code:

Fpmg1 <- pmg(ret ~ HML_OBS + SMB + Mktrf + HML, Analysis4_Weighted, index = c("permno"))
summary(Fpmg1)

I currently have 1,354,623 rows and 11 columns. I get the output below, where the estimates for all coefficients except the intercept are NA.

Mean Groups model

Call:
pmg(formula = ret ~ HML_OBS + SMB + Mktrf + HML, data = Analysis4_Weighted, 
    index = c("date", "permno"))

Unbalanced Panel: n = 295, T = 3567-6287, N = 1349058

Residuals:
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
-1.065356 -0.077703 -0.008573  0.000000  0.060437 19.741368 

Coefficients:
             Estimate Std. Error z-value Pr(>|z|)   
(Intercept) 0.0110395  0.0034105   3.237 0.001208 **
HML_OBS            NA         NA      NA       NA   
SMB                NA         NA      NA       NA   
Mktrf              NA         NA      NA       NA   
HML                NA         NA      NA       NA   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Total Sum of Squares: 50764
Residual Sum of Squares: 45906
Multiple R-squared: 0.0957

I filtered the data as follows before running the model:

Analysis4_Weighted <- 
  Analysis4_Weighted %>%
  dplyr::filter(!is.na(HML_OBS))

Analysis4_Weighted <- 
  Analysis4_Weighted %>%
  dplyr::filter(!is.na(ret))

Analysis4_Weighted <- 
  Analysis4_Weighted %>%
  group_by(date) %>%
  dplyr::filter(n() > 10)

Do you know why I do not get any estimates?

My data consist of returns on many different stocks over a long time period, and I am trying to test the coefficients' ability to predict stock returns over that period across the various stocks.

Thank you!



Solution 1:[1]

This may be because pmg requires, for the regression it runs for each permno, at least n+1 time-series observations to estimate n factors. Some of your permnos may not have n+1 observations. You would need either to generate data for the missing periods or to eliminate the permnos that do not have enough observations for estimation.
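The check described above can be sketched in base R. This is a minimal illustration on a toy data frame, not the asker's data: the object names `panel`, `min_obs`, and `panel_ok` are made up for the example, and the threshold simply follows the n+1 rule of thumb stated in this answer.

```r
# Toy long-format panel; names and values are illustrative assumptions.
set.seed(1)
panel <- data.frame(
  permno = c(rep(10001, 8), rep(10002, 3), rep(10003, 6)),
  date   = c(1:8, 1:3, 1:6),
  ret    = rnorm(17)
)

n_factors <- 4               # HML_OBS, SMB, Mktrf, HML
min_obs   <- n_factors + 1   # the n + 1 threshold mentioned in the answer

# Count the rows available for each permno
obs_per_permno <- table(panel$permno)

# Keep only permnos with at least min_obs observations
keep     <- names(obs_per_permno)[obs_per_permno >= min_obs]
panel_ok <- panel[as.character(panel$permno) %in% keep, ]

unique(panel_ok$permno)   # permno 10002 (3 rows) is dropped
```

On real data the same counting step shows how many permnos would be lost before deciding whether to drop them.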

Solution 2:[2]

From this line in the output

pmg(formula = ret ~ HML_OBS + SMB + Mktrf + HML, data = Analysis4_Weighted, 
    index = c("date", "permno"))

we can see that you (implicitly or explicitly) defined a variable called date as the first index variable. The first index variable is meant to be the unit of observation (often called the individual index), while the second index variable is supposed to be the time period. Very likely your variable date is the time period and should go into the second slot, with permno in the first slot of the index argument.

Try to specify your pdata.frame explicitly beforehand and use it in the estimation with pmg, i.e., something along these lines:

library(plm)
pdat <- pdata.frame(Analysis4_Weighted, index = c("permno", "date"))
pmg(formula = ret ~ HML_OBS + SMB + Mktrf + HML, data = pdat)
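One plausible way the index order could produce NA coefficients: if a factor such as Mktrf takes a single value per date (an assumption, since the question does not show the data), it has no variation inside a group defined by date and cannot be separated from the intercept in a per-date regression. The base R sketch below checks this on toy data; `panel` and `varies_within` are hypothetical names used only for illustration.

```r
# Toy data: 3 stocks observed on 5 dates (illustrative values only).
panel <- expand.grid(permno = c(10001, 10002, 10003), date = 1:5)
# Assumed market-wide factor: one value per date, shared by all stocks
panel$Mktrf <- c(0.01, -0.02, 0.03, 0.00, 0.015)[panel$date]

# TRUE if x takes more than one value inside at least one group
varies_within <- function(x, group) {
  any(tapply(x, group, function(v) length(unique(v)) > 1))
}

varies_within(panel$Mktrf, panel$date)    # FALSE: constant within each date
varies_within(panel$Mktrf, panel$permno)  # TRUE: varies over time per stock
```

Running the same check on each regressor, grouped by whichever variable sits in the first index slot, indicates whether the per-group regressions have anything to estimate.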

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Richard Gregory
Solution 2 Helix123