Native pipe with purrr::map_dfr()

I'd like to use the new native pipe, |>, with purrr::map_dfr(). (To make it reproducible, I'm passing the datasets as strings instead of paths, but that shouldn't make a difference.)

csvs <- c(
  "csv_a" = "a,b,c\n1,2,3\n4,5,6",
  "csv_b" = "a,b,c\n-1,-2,-3"
)
col_types <- readr::cols(.default = readr::col_character())

# Approach 1
csvs |> 
  purrr::map_dfr(
    .f = function(p) {
      readr::read_csv(
        file = I(p),
        col_types = col_types
      )
    }
  )

# Approach 2
library(magrittr)
csvs %>%
  purrr::map_dfr(
    .x = .,
    .f = ~readr::read_csv(
      file      = I(.),
      col_types = col_types
    )
  )

I have two questions, mostly to continue my understanding of the native pipe.

Question 1

How do I replace the explicit function(p) part with the new {\(x)...}() syntax? The attempt below throws "Error in standardise_path(file) : argument "p" is missing, with no default".

csvs |> 
  purrr::map_dfr(
    .f = 
      {\(p)
        readr::read_csv(
          file      = I(p),
          col_types = col_types
        )
      }()
  )

Question 2

Can I also mimic the magrittr approach (#2)? This somehow reads each row twice, including the header.

csvs |> 
  {\(p)
    purrr::map_dfr(
      .x = p,
      .f = ~readr::read_csv(
        file      = I(p),
        col_types = col_types
      )
    )
  }()

# Produces
# A tibble: 8 x 3
#   a     b     c
#   <chr> <chr> <chr>
# 1 1     2     3
# 2 4     5     6
# 3 a     b     c
# 4 -1    -2    -3
# 5 1     2     3
# 6 4     5     6
# 7 a     b     c
# 8 -1    -2    -3

Edit: In response to @MrFlick's comment, I've wrapped the file argument in I() in case that becomes a requirement in future versions of readr (it seems to work fine now without it). If you're passing typical file paths (instead of literal strings), remove the call to I().



Solution 1:[1]

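Answer for Question 1: pass the lambda itself as .f, without the surrounding {...}(). The trailing () calls the function immediately with no arguments, which is why p is reported as missing. The {\(x) ...}() wrapper is only needed when the function sits directly on the right-hand side of |>.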

csvs |> 
  purrr::map_dfr(
    .f = \(k) {
      readr::read_csv(
        file      = k,
        col_types = col_types
      )
    }
  )

#   a     b     c
#   <chr> <chr> <chr>
# 1 1     2     3
# 2 4     5     6
# 3 -1    -2    -3
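
To make the difference concrete, here is a minimal base R sketch (assuming R >= 4.1). It uses lapply() as a stand-in for purrr::map_dfr() and a toy lambda instead of readr::read_csv(), so the names piped, msg and fixed are illustrative only.

```r
# Requires R >= 4.1 for |> and \(x).

# On the right-hand side of |>, the pipe inserts the left-hand side as the
# first argument of the call, so this is evaluated as {\(p) p + 1}(c(1, 2, 3)).
piped <- c(1, 2, 3) |> {\(p) p + 1}()

# As an argument value, nothing is inserted: the trailing () calls the lambda
# with no arguments, and the body fails because p is missing.
msg <- tryCatch(
  lapply(c(1, 2, 3), {\(p) p + 1}()),
  error = function(e) conditionMessage(e)
)

# Passing the lambda itself (no braces, no trailing parentheses) works.
fixed <- lapply(c(1, 2, 3), \(p) p + 1)
```

Here msg holds the "argument is missing" error text, while piped and fixed both contain 2, 3 and 4 (as a vector and a list, respectively).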

Solution 2:[2]

Answer for Question 2: the inner function uses p, which refers to the whole csvs vector on every call. So the inner function ignores the element it is mapping over and reads the entire vector each time, which is why every row appears twice. You can avoid that by using the .x pronoun:

csvs |> 
  {\(p)
    purrr::map_dfr(
      .x = p,
      .f = ~readr::read_csv(
        file      = I(.x),
        col_types = col_types
      )
    )
  }()
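
The same scoping effect can be seen in a smaller sketch, assuming purrr is installed. The vector words and the use of length() are hypothetical stand-ins for csvs and readr::read_csv().

```r
library(purrr)

# Hypothetical stand-in for the csvs vector.
words <- c(first = "a", second = "bb")

# The formula refers to p, the outer function's argument, so every
# iteration sees the whole two-element vector.
whole <- words |> {\(p) map_int(p, ~ length(p))}()

# The formula refers to .x, the current element, so every iteration
# sees a single value.
each <- words |> {\(p) map_int(p, ~ length(.x))}()
```

whole is 2 for both elements and each is 1 for both, mirroring how the original attempt read both CSV strings on every iteration.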

Stylistically, it might be nicer to avoid the formula mapper altogether, since you don't have any custom behavior in your function. The ... in purrr::map_dfr will be passed on to the function on each call (see footnote 1).

csvs |> 
  {\(p) purrr::map_dfr(.x = p, .f = readr::read_csv, col_types = col_types)}()

Since you don't reuse the p argument, the anonymous function is also unnecessary:

csvs |> 
  purrr::map_dfr(.f = readr::read_csv, col_types = col_types)
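
As a small illustration of how the ... arguments are forwarded, here is a sketch that assumes purrr is installed and uses paste() in place of readr::read_csv(). The names labels and same are illustrative only.

```r
library(purrr)

# Extra arguments after .f are forwarded to .f on every call,
# so each element is pasted together with "z" using sep = "-".
labels <- map_chr(c("a", "b"), paste, "z", sep = "-")

# Equivalent form with an explicit lambda.
same <- map_chr(c("a", "b"), \(el) paste(el, "z", sep = "-"))
```

Both calls give "a-z" and "b-z". The explicit lambda is more verbose but makes it clearer which arguments belong to the mapped function.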

Footnote 1: @MrFlick is correct that I() should be used in principle if you're passing literal strings instead of a file name. In your case, however, you do not need it because every string in the csvs vector contains a newline. See here for details. I took it out to illustrate your alternatives.
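
For reference, a minimal sketch of the explicit form, assuming readr >= 2.0.0 (where I() marks a string as literal data). The variable names are illustrative.

```r
library(readr)

# A literal CSV string rather than a file path.
literal <- "a,b,c\n1,2,3"

# I() states explicitly that the string is data, not a path, so the call
# does not rely on readr spotting the embedded newline.
from_literal <- read_csv(
  I(literal),
  col_types = cols(.default = col_character())
)
```

Wrapping the input in I() is mostly useful when a string might not contain a newline, for example a header-only file.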

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution    Source
Solution 1  wibeasley
Solution 2  Bob Zimmermann