Function to read specific rows and columns of a file in a folder
I have, for example, two *.csv files in a folder. I would like to create a function that reads columns 1 and 3 of each file line by line and saves each line as a vector in a list. Finally, I want to merge these lines by the values in column 1.
Two example files:
library(tibble)
set.seed(10)
df1 <- tibble(var1 = c("sp1", "sp2", "sp3", "sp4", "sp5", "sp6", "sp7", "sp8", "sp9", "sp10"),
              var2 = round(rnorm(10), 1),
              var3 = round(rnorm(10), 3))
set.seed(11)
df2 <- tibble(var1 = c("sp1", "sp2", "sp3", "sp4", "sp5", "sp6", "sp7", "sp8", "sp9", "sp10"),
              var2 = round(rnorm(10), 1),
              var3 = round(rnorm(10), 3))
Expected output:
var1 df1$var3 df2$var3
<chr> <dbl> <dbl>
1 sp1 1.10 -0.828
2 sp2 0.756 -0.348
3 sp3 -0.238 -1.54
4 sp4 0.987 -0.256
5 sp5 0.741 -1.15
6 sp6 0.089 0.012
7 sp7 -0.955 -0.223
8 sp8 -0.195 0.888
9 sp9 0.926 -0.592
10 sp10 0.483 -0.656
Solution 1:[1]
This is a join operation. Read both files in as data frames, drop var2 from each, and join them on var1:
library(dplyr)
df1 <- select(df1, -var2)
df2 <- select(df2, -var2)
full_join(df1, df2, by = 'var1', suffix = c('.df1', '.df2'))
# A tibble: 10 × 3
var1 var3.df1 var3.df2
<chr> <dbl> <dbl>
1 sp1 1.10 -0.828
2 sp2 0.756 -0.348
3 sp3 -0.238 -1.54
4 sp4 0.987 -0.256
5 sp5 0.741 -1.15
6 sp6 0.089 0.012
7 sp7 -0.955 -0.223
8 sp8 -0.195 0.888
9 sp9 0.926 -0.592
10 sp10 0.483 -0.656
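If the tables really live as *.csv files in a folder, the same idea can be wrapped in a function. The following is only a minimal sketch: `read_and_join`, its `key` argument, and the `csv_demo` folder are hypothetical names, and it assumes every file has the key in column 1 (named var1, as in the question) and the value of interest in column 3. Each value column gets the file name as a suffix so the joined columns stay distinguishable.

```r
library(dplyr)

# Hypothetical helper: read every *.csv in `folder`, keep columns 1 and 3,
# and join all tables on the key column (assumed to be column 1).
read_and_join <- function(folder, key = "var1") {
  files <- list.files(folder, pattern = "\\.csv$", full.names = TRUE)
  tables <- lapply(files, function(path) {
    # File name without extension, used to label the value column
    tag <- tools::file_path_sans_ext(basename(path))
    read.csv(path, stringsAsFactors = FALSE) %>%
      select(1, 3) %>%
      rename_with(~ paste0(.x, ".", tag), .cols = -1)
  })
  # Join the tables pairwise, left to right
  Reduce(function(x, y) full_join(x, y, by = key), tables)
}

# Usage sketch with two small files written to a temporary folder
folder <- file.path(tempdir(), "csv_demo")
dir.create(folder, showWarnings = FALSE)
write.csv(data.frame(var1 = c("sp1", "sp2"), var2 = c(0.1, 0.2), var3 = c(1.5, 2.5)),
          file.path(folder, "df1.csv"), row.names = FALSE)
write.csv(data.frame(var1 = c("sp1", "sp2"), var2 = c(0.3, 0.4), var3 = c(3.5, 4.5)),
          file.path(folder, "df2.csv"), row.names = FALSE)

result <- read_and_join(folder)
print(result)
```

Because the tables are joined with `Reduce()`, the sketch also works for more than two files, as long as they all share the same key column.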
Which join type to use depends on what should happen when the two tables do not contain exactly the same set of values in var1.
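To make the difference concrete, here is a small sketch with two made-up tables, `a` and `b`, whose var1 values only partly overlap. The data are invented for illustration; only the dplyr join functions themselves are real.

```r
library(dplyr)

# Made-up tables: "sp1" exists only in a, "sp4" only in b
a <- tibble(var1 = c("sp1", "sp2", "sp3"), var3 = c(1, 2, 3))
b <- tibble(var1 = c("sp2", "sp3", "sp4"), var3 = c(20, 30, 40))

# full_join keeps every key from both tables and fills gaps with NA
full <- full_join(a, b, by = "var1", suffix = c(".a", ".b"))

# inner_join keeps only keys present in both tables
inner <- inner_join(a, b, by = "var1", suffix = c(".a", ".b"))

# left_join keeps every key from the first table only
left <- left_join(a, b, by = "var1", suffix = c(".a", ".b"))

print(full)
print(inner)
print(left)
```

Here `full` has four rows (sp1 to sp4), `inner` has two (sp2 and sp3), and `left` has three (sp1 to sp3, with NA in var3.b for sp1).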
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | divibisan |
