Is there a loop in R to open multiple .CSV from a folder, apply a change (e.g. remove specific columns) and save it as a .txt with the same name?

I am trying to create a loop that:

opens all .csv in my folder (separately)
removes the columns 1 to 4 and 7 to 13 from those .csv (separately)
saves the edited file (as .csv or as .txt) (separately)

I have tried to use the following function, but it merges all the data:

library(tidyverse)

files <- list.files(".", pattern = ".csv")

dat <- files %>% 
  map_dfr(
    ~ read_csv(.x) %>% 
      slice(7:nrow(.))
  ) %>% 
  select(-c(1:4, 7:13))

I also tried using file_paths <- fs::dir_ls("/directory", pattern = ".csv") to open the folder,

but I get Error: [ENOENT] Failed to search directory '/Test2': no such file or directory. (I had already set that folder as the working directory and just copy-pasted the path.) I read this might be due to an outdated Windows version. For now I'm doing it manually, and I know there must be a faster way...

I think this should be easily doable, but I cannot find the answer anywhere. Thank you very much.

An example of the csv is:

ID,Timestamp,Rec,RTC_TC,A0,A1,Heat,Thot1,Thot2,Tcold1,Tcold2,DT,group

P21,2021/11/26 00:09:00,01,20.25,18,20,off,20.88,21.06,20.25,20.5,2021-11-25T16:09:00.00Z,0_1st

P21,2021/11/26 00:09:01,01,20.25,18,20,off,20.81,21.06,20.25,20.5,2021-11-25T16:09:01.00Z,0_1st

P21,2021/11/26 00:09:01,02,20.25,18,20,off,20.81,21.06,20.25,20.5,2021-11-25T16:09:01.01Z,0_1st

P21,2021/11/26 00:09:01,03,20.25,18,20,off,20.81,21.06,20.25,20.5,2021-11-25T16:09:01.02Z,0_1st

Rows: 7496 Columns: 13
-- Column specification --------------------------------------------------------------------------------------
Delimiter: ","
chr  (5): treeID, Timestamp, Rec, Heat, group
dbl  (7): RTC_TC, A0, A1, Thot1, Thot2, Tcold1, Tcold2
dttm (1): DT

i Use `spec()` to retrieve the full column specification for this data.
i Specify the column types or set `show_col_types = FALSE` to quiet this message.
structure(list(ID = c("P21", "P21", "P21", "P21", "P21", 
"P21"), Timestamp = c("2021/11/26 06:39:00", "2021/11/26 06:39:00", 
"2021/11/26 06:39:00", "2021/11/26 06:39:00", "2021/11/26 06:39:00", 
"2021/11/26 06:39:00"), Rec = c("01", "02", "03", "04", "05", 
"06"), RTC_TC = c(19.75, 19.75, 19.75, 19.75, 19.75, 19.75), 
    A0 = c(-8, -8, -2, -2, -2, -2), A1 = c(-4, -4, 0, 0, 0, 0
    ), Heat = c("off", "off", "off", "off", "off", "off"), Thot1 = c(19.88, 
    19.88, 19.88, 19.81, 19.81, 19.81), Thot2 = c(20.19, 20.19, 
    20.19, 20.19, 20.19, 20.19), Tcold1 = c(20, 20, 19.5, 19.5, 
    19.5, 19.5), Tcold2 = c(20, 20, 20, 20, 20, 20), DT = structure(c(1637879940, 
    1637879940.01, 1637879940.02, 1637879940.03, 1637879940.04, 
    1637879940.05), tzone = "UTC", class = c("POSIXct", "POSIXt"
    )), group = c("6_2nd", "6_2nd", "6_2nd", "6_2nd", "6_2nd", 
    "6_2nd")), row.names = c(NA, -6L), class = c("tbl_df", "tbl", 
"data.frame"), problems = <pointer: 0x00000000176c1c30>)

r loops csv import

Solution 1:^[1]

Here I suggest using purrr::walk(), which is designed for iteratively applying a function for its side effect (e.g. writing a file) without directly creating an output.

This function reads in each file, drops the columns you specify, and then writes the result out. There are a few options in the function that you can modify to suit. First, it has no output directory by default, but you could set that to wherever you want to save these outputs (e.g. "./out/"). It also automatically prepends "new_" to each file name to distinguish it from the original. You can change this to NULL to just overwrite the original, for example.

For the read/write, I like {vroom} for speed and the API, but you can use base R or other options here too according to your preference.

library(tidyverse)
library(vroom)
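# read file list (the pattern is a regex: escape the dot and anchor the extension)
files <- list.files(".", pattern = "\\.csv$")
# specify columns to drop (by position)
drop_cols <- c(1:4, 7:13)

# function to read, trim, write
f <- function(file, drop_cols, outdir = NULL, prefix = "new_") {
  # create the output directory if one was given; stay quiet if it already exists
  if (!is.null(outdir)) {
    dir.create(outdir, showWarnings = FALSE)
  }
  # read the file and drop the unwanted columns
  x <- vroom(file) %>% select(-all_of(drop_cols))
  # file name without extension, used to build the output name
  nm <- tools::file_path_sans_ext(file)
  # outdir is pasted as-is, so it needs a trailing slash (e.g. "./out/")
  x %>% vroom_write(paste0(outdir, prefix, nm, ".csv"), delim = ",")
}
# iterate over files, calling f() for its side effect
walk(files, ~f(.x, drop_cols = drop_cols))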
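
If you would rather not load any packages, roughly the same thing can be sketched with a plain for loop in base R. This is only a sketch under a few assumptions: the helper name trim_csv_to_txt is made up for illustration, every column is read as text so that values such as "01" are written back unchanged, and the .txt output is tab-delimited (change sep if you want something else). Each output keeps the name of its input file, with only the extension changed.

```r
# Hypothetical helper (not part of the answer above): base R only.
# Reads one CSV, drops columns by position, writes <same name>.txt.
trim_csv_to_txt <- function(file, drop_cols, outdir = ".") {
  # read every column as character so values are written back as they were
  x <- read.csv(file, colClasses = "character", check.names = FALSE)
  # negative positions remove columns; drop = FALSE keeps a data frame
  x <- x[, -drop_cols, drop = FALSE]
  # same base name as the input, with a .txt extension
  out <- file.path(outdir, paste0(tools::file_path_sans_ext(basename(file)), ".txt"))
  # assumption: tab-delimited output, no quotes, no row names
  write.table(x, out, sep = "\t", quote = FALSE, row.names = FALSE)
  invisible(out)
}

# process each CSV in the working directory separately
files <- list.files(".", pattern = "\\.csv$", full.names = TRUE)
drop_cols <- c(1:4, 7:13)
for (file in files) {
  trim_csv_to_txt(file, drop_cols)
}
```

Because each file is read, trimmed and written inside its own loop iteration, nothing gets merged across files, which is the difference from the map_dfr() attempt in the question.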

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1
