Full join of multiple csv files in R
I know similar questions have been asked many times, but I am stuck very close to the solution, and some research did not bring me closer. As a beginner in R, I may be asking the wrong questions or searching for the wrong keywords. In that case, pointers are welcome.
I have a folder containing multiple .csv files with ";" as the delimiter. In each file, two columns together form the id. The tables all differ in size, i.e. the number of rows varies, as does the number of columns.
As an example the data looks like this:
| name | extension | data1 | data2 | data n |
|---|---|---|---|---|
| paul | ch | 7 | 9 | n1 |
| sue | nc | 0 | 8 | n2 |
| sue | ch | 6 | 8 | n3 |
| etc. | | | | |
The data has the following logic: data1 contains all data from one sample (sample name: data1) and gives the number of occurrences of each name, for either nc or ch.
My aim is now to bring together all these tables into a combined data frame.
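To make the goal concrete, here is a minimal sketch of the result I am after, using two made-up tables. The names `sample_a`, `sample_b`, and the column `data3` are hypothetical, and the values are only loosely based on the example above.

```r
library(dplyr)

# Two hypothetical sample tables sharing the id columns name + extension
sample_a <- data.frame(
  name      = c("paul", "sue", "sue"),
  extension = c("ch", "nc", "ch"),
  data1     = c(7, 0, 6)
)
sample_b <- data.frame(
  name      = c("paul", "sue"),
  extension = c("ch", "nc"),
  data2     = c(9, 8),
  data3     = c(1, 2)
)

# A full join keeps every id pair found in either table;
# cells with no matching row in the other table become NA
combined <- full_join(sample_a, sample_b, by = c("name", "extension"))
combined
```

Here the row for sue/ch exists only in `sample_a`, so its `data2` and `data3` cells are filled with NA in the combined table.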
First I tried iterating with a for loop, but this gave poor results. Later I came across a tidyverse approach as follows:
files <- dir(path = "~/Documents/data", pattern = "*.csv")
data <- files %>%
map(read_csv2(paste("~/Documents/data", files, sep = "/"))) %>%
reduce(full_join)
data
This starts working, but fails with the error:
Error: Files must all have 31 columns:
* File 2 has 34 columns
Here is where I'm stuck. I think the problem occurs when the data frames are compiled into a list. It seems to me that the data frames are required to have the same dimensions. In the examples I found on Stack Overflow and elsewhere, the example data always has the same length.
Can anybody help me find the mistake?
Best wishes
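One likely cause, offered as a sketch and not a confirmed diagnosis: in `map(read_csv2(paste(...)))` the `read_csv2()` call is evaluated once with the whole vector of paths. In readr 2.0 and later, it then tries to stack all files row-wise, which requires identical columns. Passing the function itself to `map()` reads one file at a time instead. The example below builds two small hypothetical files in a temporary folder. The folder name, file names, and values are assumptions for illustration, and the id columns `name` and `extension` are taken from the example table.

```r
library(readr)
library(purrr)
library(dplyr)

# Hypothetical stand-in for ~/Documents/data: a temporary folder with two
# semicolon-delimited files that differ in number of rows and columns
data_dir <- file.path(tempdir(), "csv_join_demo")
dir.create(data_dir, showWarnings = FALSE)
writeLines(c("name;extension;data1", "paul;ch;7", "sue;nc;0", "sue;ch;6"),
           file.path(data_dir, "a.csv"))
writeLines(c("name;extension;data2;data3", "paul;ch;9;1", "sue;nc;8;2"),
           file.path(data_dir, "b.csv"))

# pattern is a regular expression, so match a literal ".csv" at the end;
# full.names = TRUE returns complete paths, so paste() is not needed
files <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)

data <- files %>%
  map(read_csv2) %>%   # read_csv2 is called once per file, giving a list of tibbles
  reduce(function(x, y) full_join(x, y, by = c("name", "extension")))  # join on the id columns
data
```

Naming the `by` columns explicitly keeps the join on the two id columns only. Without it, `full_join()` joins on every column name the two tables happen to share.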
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
