plm::lag isn't lagging. How to deal with lags in panel data

I've scoured SO and it seems others have had this same question, but the solutions aren't working for me. I have a reprex for you as follows:

library(dplyr)

name <- c("Jim", "Jim", "Jim", "Bob", "Bob", "Bob")
number <- c(1, 2, 3, 1, 2, 3)

panel <- data.frame(name, number)

panel <- panel %>%
  group_by(name) %>%
  mutate(lagged = plm::lag(number, 1))

For me, this doesn't return anything different than what I put in and I have no idea why. I thought plm::lag would lag my variable while dealing with the panel structure, but it doesn't appear to be working. I've tried with and without the group_by but neither works.

Also open to lagging the variable within a plm() regression although I'm cautious of the black box.

TYIA



Solution 1:[1]

plm::lag is exactly the same function as stats::lag. The difference is that the plm package also provides a lag.pseries method, which works on pseries objects.
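To see why the reprex in the question returns its input unchanged, it can help to call stats::lag on a plain numeric vector. The sketch below uses only base R, and the variable names are just for illustration. On a plain vector the default method only shifts the time-series attribute (tsp) and leaves the values themselves alone.

```r
# A plain numeric vector, like `number` inside the grouped mutate()
x <- c(1, 2, 3)

# stats::lag on a non-pseries, non-ts vector: the values are not moved
shifted <- stats::lag(x, 1)

# Compare the underlying values with the input (attributes dropped)
same_values <- identical(as.numeric(shifted), x)
print(same_values)

# The only thing that changed is the "tsp" attribute (start, end, frequency)
print(attr(shifted, "tsp"))
```

So inside mutate(), where number is an ordinary vector, the values come back as they went in.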

Create a pdata.frame where the individuals are given by the first column, the time is the second column and subsequent columns are pseries data, here just column a. Then we can apply lag to a.

In the code below be sure that

  1. dplyr is not loaded or else
  2. use plm::lag in place of lag or else
  3. load dplyr using library(dplyr, exclude = c("lag", "filter"))

since dplyr masks R's own lag with its own, non-generic version.
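If you're not sure which lag your session will pick up, one way to check is sketched below. It uses only base R and utils functions. What it prints depends on which packages are attached when you run it.

```r
# Attached packages that define `lag`, in search-path order.
# If dplyr is attached, it is expected to appear ahead of stats.
lag_locations <- find("lag")
print(lag_locations)

# The namespace that the bare name `lag` currently resolves to
lag_owner <- environmentName(environment(lag))
print(lag_owner)
```

If the second line reports dplyr rather than stats, a bare lag() call will not reach lag.pseries.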

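library(plm)

name <- c("Jim", "Jim", "Jim", "Bob", "Bob", "Bob")
number <- c(1, 2, 3, 1, 2, 3)
a <- 1:6

# By default the first column is the individual index and the second the time index
pd <- pdata.frame(data.frame(name, number, a))

pd2 <- pd
# pd2$a is a pseries, so lag() dispatches to lag.pseries and lags within each individual
pd2$lag_a <- lag(pd2$a)

pd2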
##       name number a lag_a
## Bob-1  Bob      1 4    NA
## Bob-2  Bob      2 5     4
## Bob-3  Bob      3 6     5
## Jim-1  Jim      1 1    NA
## Jim-2  Jim      2 2     1
## Jim-3  Jim      3 3     2
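If you'd rather stay in the dplyr pipeline from the question, a minimal sketch is to use dplyr::lag inside the grouped mutate instead of plm::lag. This is an alternative to the pdata.frame approach above, not part of it. It assumes the rows are already sorted by time within each name, since dplyr::lag shifts by row position rather than by a time index.

```r
library(dplyr)

# Same data as in the question
panel <- data.frame(
  name   = c("Jim", "Jim", "Jim", "Bob", "Bob", "Bob"),
  number = c(1, 2, 3, 1, 2, 3)
)

panel_lagged <- panel %>%
  group_by(name) %>%                          # lag within each individual
  mutate(lagged = dplyr::lag(number, 1)) %>%  # shift by one row; first row per group is NA
  ungroup()

print(panel_lagged)
```

The first row of each name gets NA, and the remaining rows carry the previous row's number.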


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

[1] Solution 1. Source: Stack Overflow