Removing nested variables if there are NAs in certain variables inside the nested variable

I have a dataframe that looks something like this:

df <- data.frame(gvkey = c(1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,6,6,6),
                 date = c(01,02,03,01,02,03,01,02,03,01,02,03,01,02,03,01,02,03),
                 var1 = c(3,6,9,6,3,1,NA,NA,NA,5,4,6,7,9,4,NA,1,3),
                 Var2 = c(NA,3,6,4,6,NA,6,9,7,5,1,4,7,4,NA,5,9,5),
                 var3 = c(NA,NA,NA,6,3,7,8,5,4,9,6,NA,1,6,4,2,6,4),
                 Var4 = c(4,NA,6,2,9,7,4,8,6,NA,NA,NA,9,7,6,2,6,4))
df

Each firm, identified by "gvkey", has several rows of observations, and the firm-specific variables are in the columns. I want to remove firms that have only NA observations for specific variables. For example, "gvkey" 1 is NA for every "var3" value, so I want to remove all rows with "gvkey" 1. The same goes for "gvkey" 3 and "var1". I want to be able to control which variables this check is applied to.

What I want to end up with is something like this:

df1 <- data.frame(gvkey = c(2,2,2,4,4,4,5,5,5,6,6,6),
                  date = c(01,02,03,01,02,03,01,02,03,01,02,03),
                  var1 = c(6,3,1,5,4,6,7,9,4,NA,1,3),
                  Var2 = c(4,6,NA,5,1,4,7,4,NA,5,9,5),
                  var3 = c(6,3,7,9,6,NA,1,6,4,2,6,4),
                  Var4 = c(2,9,7,NA,NA,NA,9,7,6,2,6,4))
df1

I have tried to group by gvkey and nest, and then filter inside the nest:

`df <- df %>% group_by(gvkey) %>% nest %>% mutate(model = map(data, ~filter(., !all(is.na(var1))))) %>% unnest(cols = c(data, model))`

but I just get

Error: can't recycle input of size "x" to size 0.

Any solutions?



Solution 1:[1]

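Maybe this helps a little. It does not let you choose which variables to check (it checks all columns), but you may be able to adapt it. It splits the data by gvkey, drops every column that is entirely NA within a group, and then discards any group that lost a column.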

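library(dplyr)
library(purrr)

df %>%
  group_split(gvkey) %>%
  # within each firm, keep only columns with at least one non-NA value
  map(~ .x %>% select(where(~ sum(!is.na(.)) > 0))) %>%
  # discard firms that lost a column, i.e. had an all-NA column
  discard(~ length(.x) < ncol(df)) %>%
  bind_rows()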
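
If you need control over which variables are checked, a grouped filter might be enough, without nesting or splitting. The following is only a sketch, assuming dplyr 1.0.4 or later (for if_all()); check_vars is a made-up name for the vector of columns you want to test, here var1 and var3 as in the question.

```r
library(dplyr)

df <- data.frame(gvkey = c(1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,6,6,6),
                 date = c(1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3),
                 var1 = c(3,6,9,6,3,1,NA,NA,NA,5,4,6,7,9,4,NA,1,3),
                 Var2 = c(NA,3,6,4,6,NA,6,9,7,5,1,4,7,4,NA,5,9,5),
                 var3 = c(NA,NA,NA,6,3,7,8,5,4,9,6,NA,1,6,4,2,6,4),
                 Var4 = c(4,NA,6,2,9,7,4,8,6,NA,NA,NA,9,7,6,2,6,4))

# Hypothetical name: the columns that must not be entirely NA within a firm
check_vars <- c("var1", "var3")

df1 <- df %>%
  group_by(gvkey) %>%
  # keep a firm only if every checked column has at least one non-NA value
  filter(if_all(all_of(check_vars), ~ !all(is.na(.x)))) %>%
  ungroup()

df1
```

With check_vars set to var1 and var3, this should drop gvkey 1 and 3 and keep gvkey 4, even though its Var4 is all NA, because Var4 is not in the list. Changing check_vars changes which columns are tested.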

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Julian