Issues with a loop combining different data types in one column
I have more than 1000 csv files. I would like to combine them into a single file after running some processing steps. So, I used a loop as follows:
> setwd("C:/....")
> # Get the names of all the csv files in the current directory.
> files <- dir(".", pattern = ".csv$")
>
> for (i in 1:length(files)) {
>   obj_name <- files %>% str_sub(end = -5)
>   assign(obj_name[i], read_csv(files[i]))
> }
Up to here, it works well.
I tried to concatenate the imported files into a list to manipulate them all at once, as follows:
> command <- paste0("RawList <- list(", paste(obj_name, collapse = ","), ")")
> eval(parse(text = command))
>
> rm(i, obj_name, command, list = ls(pattern = "^g20"))
> Ref_com_list = list()
Up to here, it is still okay. But ...
> for (i in 1:length(RawList)) {
>   df <- RawList[[i]] %>%
>     pivot_longer(cols = -A, names_to = "B", values_to = "C") %>%
>     mutate(time_sec = paste(YMD[i], B) %>% ymd_hms()) %>%
>     mutate(minute = format(as.POSIXct(B, format = "%H:%M:%S"), "%M"))
>
>   ...(some calculation)
>   Ref_com_list[[i]] <- file_all
> }
>
> Ref_com_all <- do.call(rbind, Ref_com_list)
At that point, I got the following error:
> Error: Can't combine `A` <double> and `B` <datetime<UTC>>.
> Run `rlang::last_error()` to see where the error occurred.
If I run an individual file, it works well. But if I run it in the for loop, the error shows up. Could anyone tell me what the problem is?
Thanks a lot in advance.
Solution 1:[1]
There is substantial scope for improvement in your code. Broadly speaking, if you are working in the tidyverse, you can pass multiple files to read_csv directly. Example:
# Generate some sample files
tmp_dir <- fs::path_temp("some_csv_files")
fs::dir_create(tmp_dir)
for (i in 1:100) {
  readr::write_csv(mtcars, fs::file_temp(pattern = "cars",
                                         tmp_dir = tmp_dir, ext = ".csv"))
}
# Actual file reading
dta_cars <- readr::read_csv(
  file = fs::dir_ls(path = tmp_dir, glob = "*.csv"),
  id = "file_path"
)
If you want to keep track of which file each row came from, using id = "file_path" in read_csv will store the file path in a column of that name. This is arguably more efficient and less error-prone than:
for (i in 1:length(files)) {
  obj_name <- files %>% str_sub(end = -5)
  assign(obj_name[i], read_csv(files[i]))
}
Passing all the files to read_csv is much cleaner and will be faster than creating one object per file in a loop. Afterwards, you can proceed with your transformations:
dta_cars %>% ...
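If you do need the files as separate tables (for example, because each one needs its own per-file step), a named list avoids assign() and eval(parse()) altogether. The following is only a minimal sketch, assuming all files share the same columns; the directory name list_demo and the file names a.csv and b.csv are made up for illustration:

```r
library(readr)
library(dplyr)

# Hypothetical sample files, written to a temporary directory for illustration
tmp_dir <- file.path(tempdir(), "list_demo")
dir.create(tmp_dir, showWarnings = FALSE)
write_csv(head(mtcars, 3), file.path(tmp_dir, "a.csv"))
write_csv(head(mtcars, 5), file.path(tmp_dir, "b.csv"))

# Anchor the pattern so that only names ending in ".csv" match
files <- list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE)

# Read every file into one list and name the elements after the files
raw_list <- lapply(files, read_csv, show_col_types = FALSE)
names(raw_list) <- tools::file_path_sans_ext(basename(files))

# Per-file transformations would go here, e.g. raw_list <- lapply(raw_list, ...)

# Combine at the end; .id stores the list names in a "source" column
combined <- bind_rows(raw_list, .id = "source")
```

Because the list is named, individual tables remain accessible as raw_list[["a"]], and the source column plays the same role as id = "file_path" above.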
Solution 2:[2]
Try:
library(data.table)
files <- list.files(path = '.', full.names = TRUE, pattern = '\\.csv$') # match only names ending in .csv
files_open <- lapply(files, function(x) fread(x, ...)) # ... for arguments like sep, dec, etc...
big_file <- rbindlist(files_open)
fwrite(big_file, ...) # ... for arguments like sep, dec, path to save data, etc...
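If the files might not all have identical columns, rbindlist can match columns by name and pad missing ones instead of stopping. A small sketch with made-up tables (the names first and second and the columns A, B, C are only illustrative, not taken from the question's data):

```r
library(data.table)

# Two hypothetical tables whose columns only partly overlap
dt_list <- list(
  first  = data.table(A = 1:2, B = c("x", "y")),
  second = data.table(A = 3L, C = 0.5)
)

# use.names = TRUE matches columns by name, fill = TRUE pads missing columns
# with NA, and idcol records which list element each row came from
big <- rbindlist(dt_list, use.names = TRUE, fill = TRUE, idcol = "source")
print(big)
```

Note that fill = TRUE only papers over missing columns; it does not reconcile a column that has different types in different files, so that case still has to be fixed in the inputs.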
Solution 3:[3]
I have now found out why it happened. There was another file in the directory with a different file name but the same file type. So the code read all the files, including that one, and produced the error. I am sorry for the confusion. Thank you so much!
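For anyone hitting the same error, one way to spot such a stray file before combining is to compare column names and classes across the imported tables. This is only a sketch in base R with made-up data; the element names g2001, g2002, and other, and their columns, are hypothetical:

```r
# Hypothetical list of imported tables; the third one has a different layout
raw_list <- list(
  g2001 = data.frame(A = 1:2, `00:00:00` = c(0.1, 0.2), check.names = FALSE),
  g2002 = data.frame(A = 3:4, `00:00:00` = c(0.3, 0.4), check.names = FALSE),
  other = data.frame(A = as.POSIXct("2020-01-01", tz = "UTC"), note = "x")
)

# Summarise each table as a string of "column:class" pairs
col_sig <- vapply(
  raw_list,
  function(d) {
    classes <- vapply(d, function(col) class(col)[1], character(1))
    paste(names(d), classes, sep = ":", collapse = ", ")
  },
  character(1)
)

# Tables whose summary differs from the most common one are the suspects
most_common <- names(which.max(table(col_sig)))
suspects <- names(col_sig)[col_sig != most_common]
print(suspects)
```

The suspects can then be dropped from the list, or the file pattern tightened so they are never read in the first place.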
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | |
| Solution 3 | Heiwa |
