How do I read multiple CSV files into R and ensure that every column has the same data type?
I am trying to merge several large data files into one usable data frame in R, using lapply to read in the files. That part works just fine; however, one of the files stores a single column as character data rather than integer data. Is there a way to read in all of the files and force that one column in that one file to a consistent data type? I have found a workaround through trial and error, but a single-step solution would be great. For reference, this is the ICEWS event data found on the Harvard Dataverse.
library(dplyr)

list_file <- list.files(pattern = "*.csv") %>%
  lapply(read.csv, stringsAsFactors = FALSE) %>%
  bind_rows()

head(list_file)
These two separate code blocks work independently, but I would ideally like the as.integer() call to be integrated into the lapply step so that I don't have to read in and merge the files twice. Below is the workaround I have used.
list_file1 <- list.files(pattern = "*.csv") %>%
  lapply(read.csv, stringsAsFactors = FALSE) %>%
  bind_rows()

head(list_file1)
class(list_file1$CAMEO.Code)
list_file1$CAMEO.Code <- as.integer(list_file1$CAMEO.Code)
class(list_file1$CAMEO.Code)
head(list_file1$CAMEO.Code)
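One way to fold the coercion into the read step itself is base R's colClasses argument to read.csv(), which accepts a named vector matched against column names; extra arguments to lapply() are forwarded to read.csv(), so the override applies to every file. A minimal self-contained sketch (the file names and demo data here are invented for illustration):

```r
# Demo data: two tiny CSVs where CAMEO.Code arrives as integer in one
# file and as character in the other.
dir <- tempfile("csvs"); dir.create(dir)
write.csv(data.frame(CAMEO.Code = c(10L, 20L), Event = c("a", "b")),
          file.path(dir, "one.csv"), row.names = FALSE)
write.csv(data.frame(CAMEO.Code = c("30", "40"), Event = c("c", "d")),
          file.path(dir, "two.csv"), row.names = FALSE)

# The named colClasses entry forces CAMEO.Code to integer in every
# file at read time; columns not listed are still type-guessed.
combined <- do.call(rbind,
  lapply(list.files(dir, pattern = "*.csv", full.names = TRUE),
         read.csv,
         stringsAsFactors = FALSE,
         colClasses = c(CAMEO.Code = "integer")))

class(combined$CAMEO.Code)  # "integer"
```

If a file contains a value that cannot be parsed as an integer, read.csv() stops with an error rather than silently producing NAs, which is arguably safer than coercing after the merge.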
Solution 1:[1]
You could do something like this:
bind_rows(
  lapply(list.files(pattern = "*.csv"), function(f) {
    read.csv(f, stringsAsFactors = FALSE) %>%
      mutate(CAMEO.Code = as.integer(CAMEO.Code))
  })
)
Solution 2:[2]
You could also try the purrr::map_dfr() function together with the col_types argument of readr::read_csv():
list.files(pattern = "*.csv", full.names = TRUE) %>%
  purrr::map_dfr(~ readr::read_csv(.x,
    col_types = readr::cols(.default = "?", CAMEO.Code = "i")))
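This approach can be checked end-to-end on throwaway files (a sketch; it assumes the readr and purrr packages are installed, the demo file names are invented, "?" in the column spec means guess the type, and "i" forces integer):

```r
# Two demo CSVs with CAMEO.Code written as integer and as character.
dir <- tempfile("csvs"); dir.create(dir)
readr::write_csv(data.frame(CAMEO.Code = c(10L, 11L)), file.path(dir, "a.csv"))
readr::write_csv(data.frame(CAMEO.Code = c("12", "13")), file.path(dir, "b.csv"))

# map_dfr() reads each file and row-binds the results; the col_types
# spec guesses every column except CAMEO.Code, which is read as integer.
combined <- purrr::map_dfr(
  list.files(dir, pattern = "*.csv", full.names = TRUE),
  ~ readr::read_csv(.x,
      col_types = readr::cols(.default = "?", CAMEO.Code = "i")))

class(combined$CAMEO.Code)  # "integer"
```

Because the type is fixed per file at parse time, the later row-binding never sees conflicting column types.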
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | langtang |
| Solution 2 | Julian |
