Appending many files into 1 dataset
I have many ".csv" files in one folder, named "file1", "file2", "file3", ... "file50". They all have exactly the same structure. The expression for reading one file is:
read.csv(file = "file1.csv", sep = ";", dec = ".")
I need to union (append) all these files into one dataset. What is the shortest way to do this?
Solution 1:[1]
The idea is to build a table of all your files, then apply read_csv() to each file (each row of the table) to create a new list column holding the data from each CSV file.
Finally, call bind_rows() on the data column to get the appended dataset. If you want a true union of the files (i.e. no duplicate rows), call distinct() on the result of bind_rows().
For example, this could work:
library(tidyverse)
# "\\.csv$" matches only file names that end in ".csv"
result <- tibble(fns = dir(path = "C:/Users/ncama/Downloads", pattern = "\\.csv$", full.names = TRUE)) %>%
  mutate(data = map(fns, read_csv, show_col_types = FALSE))
result
#> # A tibble: 5 x 2
#> fns data
#> <chr> <list>
#> 1 C:/Users/ncama/Downloads/csv1 - Copy (2).csv <spec_tbl_df [1 x 2]>
#> 2 C:/Users/ncama/Downloads/csv1 - Copy (3).csv <spec_tbl_df [1 x 2]>
#> 3 C:/Users/ncama/Downloads/csv1 - Copy (4).csv <spec_tbl_df [1 x 2]>
#> 4 C:/Users/ncama/Downloads/csv1 - Copy.csv <spec_tbl_df [1 x 2]>
#> 5 C:/Users/ncama/Downloads/csv1.csv <spec_tbl_df [1 x 2]>
bind_rows(result$data)
#> # A tibble: 5 x 2
#> A B
#> <chr> <chr>
#> 1 hello world
#> 2 hello world
#> 3 hello world
#> 4 hello world
#> 5 hello world
Created on 2022-04-20 by the reprex package (v2.0.1)
Change the path to that of your own folder. As long as all of these files have the .csv extension, this should work.
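Note that the files in the question are semicolon-delimited, while read_csv() expects commas. The following is a minimal sketch of the same approach adapted to that case, assuming readr's read_delim() with delim = ";" and the default "." decimal mark. The temporary folder, the demo files, and the names folder, fns, combined, and deduped are hypothetical and exist only so the example runs on its own.

```r
library(readr)
library(dplyr)
library(purrr)

# Hypothetical demo data so the sketch is self-contained; in practice, set
# `folder` to the directory that already holds file1.csv ... file50.csv.
folder <- file.path(tempdir(), "csv_demo")
dir.create(folder, showWarnings = FALSE)
for (i in 1:3) {
  write_delim(
    tibble(A = "hello", B = c(1.5, i)),
    file.path(folder, paste0("file", i, ".csv")),
    delim = ";"
  )
}

# Anchor the pattern so only names ending in ".csv" match.
fns <- list.files(folder, pattern = "\\.csv$", full.names = TRUE)

# Read each semicolon-delimited file, then stack the rows.
combined <- fns %>%
  map(read_delim, delim = ";", show_col_types = FALSE) %>%
  bind_rows()

# Optional: drop duplicate rows after stacking to get a set-style union.
deduped <- distinct(combined)
```

If you would rather avoid the tidyverse, do.call(rbind, lapply(fns, read.csv, sep = ";", dec = ".")) should give an equivalent base R data frame, provided every file has the same columns.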
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Nick Camarda |
