Unnesting a list of lists in a data frame column

To unnest a data frame I can use:

df <- data_frame(
    x = 1,
    y = list(a = 1, b = 2)
)

tidyr::unnest(df)

But how can I unnest a list inside of a list inside of a data frame column?

df <- data_frame(
    x = 1,
    y = list(list(a = 1, b = 2))
)
tidyr::unnest(df)

Error:

Each column must either be a list of vectors or a list of data frames [y]



Solution 1:[1]

Note: Ignore the original and Update 1; Update 2 is better with the current state of the tidyverse.


Original:

With purrr, which is nice for lists,

library(purrr)

df %>% dmap(unlist)
## # A tibble: 2 x 2
##       x     y
##   <dbl> <dbl>
## 1     1     1
## 2     1     2

which is more or less equivalent to

as.data.frame(lapply(df, unlist))
##   x y
## a 1 1
## b 1 2

Update 1:

dmap has been deprecated and moved to purrrlyr, the home of interesting but ill-fated functions that will now shout lots of deprecation warnings at you. You could translate the base R idiom to tidyverse:

df %>% map(unlist) %>% as_tibble()

which will work fine for this case, but not for more than one row (a problem all these approaches face). A more robust solution might be

library(tidyverse)

df %>% bind_rows(df) %>%    # make larger sample data
    mutate_if(is.list, simplify_all) %>%    # flatten each list element internally 
    unnest()    # expand
#> # A tibble: 4 × 2
#>       x     y
#>   <dbl> <dbl>
#> 1     1     1
#> 2     1     2
#> 3     1     1
#> 4     1     2

Update 2:

At some point since this was asked, tidyr::unnest() got updated such that it doesn't error anymore, so you can just do

df %>%
    unnest(y) %>% 
    unnest(y)
#> # A tibble: 2 × 2
#>       x     y
#>   <dbl> <dbl>
#> 1     1     1
#> 2     1     2

If you care about the names in the list, pull them out first and then unnest the names and the list at the same time:

df %>%
    mutate(label = map(y, names)) %>%
    unnest(c(y, label)) %>% 
    unnest(y)
#> # A tibble: 2 × 3
#>       x     y label
#>   <dbl> <dbl> <chr>
#> 1     1     1 a    
#> 2     1     2 b

I'll leave the previous answers for continuity, but this is simpler.
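As a rough check of the "more than one row" concern raised in Update 1, here is a minimal sketch that applies the double unnest() to a two-row tibble whose inner lists have different lengths. The name df2 and its values are made up for illustration; it assumes tidyr 1.0.0 or later and R 4.1 or later for the native pipe.

```r
library(tibble)
library(tidyr)

# Hypothetical sample data: two rows, inner lists of length 2 and 3
df2 <- tibble(
  x = c(1, 2),
  y = list(
    list(a = 1, b = 2),
    list(a = 3, b = 4, c = 5)
  )
)

# First unnest() expands each inner list into one row per element
# (y is still a list column); the second unnest() flattens y to numeric
res <- df2 |>
  unnest(y) |>
  unnest(y)

res
```

Each row of df2 should expand independently, so x is repeated once per element of its own inner list.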

Solution 2:[2]

This can be done in a single step using unnest_longer(), available since tidyr 1.0.0:

df <- tibble::tibble(
  x = 1,
  y = list(list(a = 1, b = 2))
)

library(tidyr)
unnest_longer(df, y, indices_include = FALSE)
#> # A tibble: 2 x 2
#>       x     y
#>   <dbl> <dbl>
#> 1     1     1
#> 2     1     2

Created on 2019-09-14 by the reprex package (v0.3.0)
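If the names of the inner list matter, unnest_longer() can keep them instead of dropping them. The sketch below uses the indices_to argument to store them in a column; the column name "label" is an arbitrary choice for this example, not something from the answer above.

```r
library(tibble)
library(tidyr)

df <- tibble(
  x = 1,
  y = list(list(a = 1, b = 2))
)

# indices_to names the column that receives the inner list's names
res <- unnest_longer(df, y, indices_to = "label")

res
```

This should give one row per inner element, with the values in y and the names "a" and "b" in label.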

Solution 3:[3]

Most of the approaches above are outdated now; for the given task, I see two solutions:

tidyr::unnest(df, y) |> tidyr::unnest(y)

does what you want, as does

dplyr::mutate(df, y = purrr::map(y, unlist)) |> tidyr::unnest(y)

although it is longer. I do not really see a good case for unnesting more than one list column in a single operation, because differently sized lists inside the same row would lead to problems.
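To illustrate the point about differently sized lists, here is a small sketch with a second, hypothetical list column z whose inner list is longer than the one in y. It assumes a recent tidyr (1.0.0 or later); the exact error message may differ between versions, so the code only checks that an error occurs.

```r
library(tibble)
library(tidyr)

# Hypothetical data: y has 2 inner elements, z has 3, in the same row
df3 <- tibble(
  x = 1,
  y = list(list(a = 1, b = 2)),
  z = list(list(p = 1, q = 2, r = 3))
)

# Unnesting both columns at once needs compatible sizes per row,
# so sizes 2 and 3 are expected to fail; capture the error instead of stopping
together <- tryCatch(
  unnest(df3, c(y, z)),
  error = function(e) e
)
inherits(together, "error")

# Unnesting one column at a time works, but yields every y/z combination
one_at_a_time <- df3 |>
  unnest(y) |>
  unnest(z)
nrow(one_at_a_time)
```

Unnesting sequentially avoids the error, but it produces the cross product of the two lists, which may or may not be what you want.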

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 moodymudskipper
Solution 3 Lukas