Function to read specific rows and columns of a file in a folder

For example, I have two *.csv files in a folder. I would like to write a function that reads columns 1 and 3 of each file line by line and saves each line as a vector in a list. Finally, I want to merge these lines by the values in column 1.

The two example files:

library(tibble)

set.seed(10)
df1 <- tibble(var1 = c("sp1", "sp2", "sp3", "sp4", "sp5", "sp6", "sp7", "sp8", "sp9", "sp10"),
              var2 = round(rnorm(10), 1),
              var3 = round(rnorm(10), 3))
set.seed(11)
df2 <- tibble(var1 = c("sp1", "sp2", "sp3", "sp4", "sp5", "sp6", "sp7", "sp8", "sp9", "sp10"),
              var2 = round(rnorm(10), 1),
              var3 = round(rnorm(10), 3))

Expected output:

   var1  df1$var3 df2$var3
   <chr>    <dbl>    <dbl>
 1 sp1      1.10    -0.828
 2 sp2      0.756   -0.348
 3 sp3     -0.238   -1.54 
 4 sp4      0.987   -0.256
 5 sp5      0.741   -1.15 
 6 sp6      0.089    0.012
 7 sp7     -0.955   -0.223
 8 sp8     -0.195    0.888
 9 sp9      0.926   -0.592
10 sp10     0.483   -0.656


Solution 1:[1]

This is just a join operation. Read both files into data frames, drop var2, and join them on var1:

library(dplyr)

df1 <- select(df1, -var2)
df2 <- select(df2, -var2)

full_join(df1, df2, by = 'var1', suffix = c('.df1', '.df2'))
# A tibble: 10 × 3
   var1  var3.df1 var3.df2
   <chr>    <dbl>    <dbl>
 1 sp1      1.10    -0.828
 2 sp2      0.756   -0.348
 3 sp3     -0.238   -1.54 
 4 sp4      0.987   -0.256
 5 sp5      0.741   -1.15 
 6 sp6      0.089    0.012
 7 sp7     -0.955   -0.223
 8 sp8     -0.195    0.888
 9 sp9      0.926   -0.592
10 sp10     0.483   -0.656
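
The code above starts from df1 and df2 already in memory. If the data really lives in CSV files in a folder, the reading step could look roughly like the sketch below. It uses base R only, so it runs without extra packages. The helper name merge_csv_columns, its arguments, the temporary folder, and the sample values are all made up for illustration, and it assumes every file uses the same name for its key column. Note that merge() sorts the result on the key column by default; full_join from the answer could be swapped in if you need to keep the original row order.

```r
# Hypothetical helper: read two columns (by position) from every CSV in
# `folder` and merge the results on the first of them.
merge_csv_columns <- function(folder, key_col = 1, value_col = 3) {
  files <- list.files(folder, pattern = "\\.csv$", full.names = TRUE)
  stopifnot(length(files) > 0)

  tables <- lapply(files, function(path) {
    dat <- read.csv(path, stringsAsFactors = FALSE)
    out <- dat[, c(key_col, value_col)]
    # Tag the value column with the file name so the columns stay distinct
    file_tag <- tools::file_path_sans_ext(basename(path))
    names(out)[2] <- paste(names(out)[2], file_tag, sep = ".")
    out
  })

  # Assumption: the key column has the same name in every file
  key_name <- names(tables[[1]])[1]

  # all = TRUE keeps keys found in only some files (a full join)
  Reduce(function(x, y) merge(x, y, by = key_name, all = TRUE), tables)
}

# Usage with two small made-up files in a fresh temporary folder
folder <- tempfile("csv_demo")
dir.create(folder)
write.csv(data.frame(var1 = c("sp1", "sp2", "sp3"),
                     var2 = c(0.1, 0.2, 0.3),
                     var3 = c(1.5, 2.5, 3.5)),
          file.path(folder, "df1.csv"), row.names = FALSE)
write.csv(data.frame(var1 = c("sp1", "sp2", "sp4"),
                     var2 = c(0.4, 0.5, 0.6),
                     var3 = c(10, 20, 40)),
          file.path(folder, "df2.csv"), row.names = FALSE)

merged <- merge_csv_columns(folder)
print(merged)
```

Because each value column is renamed after its file, the same function works for more than two files without any name clashes in the merged result.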

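Which join to use depends on how you want to handle values of var1 that appear in only one of the two tables: full_join keeps them all and fills the gaps with NA, inner_join keeps only the values present in both tables, and left_join keeps every row of the first table.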
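
To see how the join type matters, here is a small sketch with two made-up tables whose var1 values only partly overlap. The tables a and b and their numbers are invented for illustration and are not the data from the question.

```r
library(dplyr)

# Two toy tables: "sp1" is only in a, "sp4" is only in b
a <- data.frame(var1 = c("sp1", "sp2", "sp3"),
                var3 = c(1.1, 0.8, -0.2),
                stringsAsFactors = FALSE)
b <- data.frame(var1 = c("sp2", "sp3", "sp4"),
                var3 = c(-0.3, -1.5, 0.9),
                stringsAsFactors = FALSE)

sfx <- c(".df1", ".df2")

inner <- inner_join(a, b, by = "var1", suffix = sfx)  # keys in both tables
left  <- left_join(a, b, by = "var1", suffix = sfx)   # every key from a
full  <- full_join(a, b, by = "var1", suffix = sfx)   # every key from either

inner$var1  # "sp2" "sp3"
left$var1   # "sp1" "sp2" "sp3"
full$var1   # "sp1" "sp2" "sp3" "sp4"
```

In the left and full joins, the rows without a match get NA in the column that comes from the other table.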

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 divibisan