Numerical find and replace issue

I have a column containing strings like the one below:

"Plate 2 Day 2 - 220304_Plate-2_Day-2-Well-number-001_Processed_PrintToExcel.xlsx"

Well numbers go from 1 to 56, and the day and plate numbers change as well. There are thousands of entries in this dataset.

I want to change the well numbers: 002 to 009, 003 to 017, 004 to 025, 005 to 033, 006 to 041, and so on.

If I use

df_find_replace <- df %>% mutate(col1 = str_replace_all(col1, pattern = "002", replacement = "009"))

then when I go on to change well 009 to 018 with

df_find_replace <- df %>% mutate(col1 = str_replace_all(col1, pattern = "009", replacement = "018"))

I'll end up changing the well that was originally 002 to 018 as well.

If these calls were chained in a pipe, would I avoid this, since each find and replace would then be working on the original df?
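A small reproduction of the problem may help here. The sketch below assumes shortened example strings and a two-entry lookup table (`well_map`, a hypothetical name). It shows that sequential replacements cascade because each call sees the previous call's output, which is also what a pipe would do. Looking each match up once in a table avoids that.

```r
library(stringr)

# Shortened stand-ins for the real file names (illustrative only)
x <- c("Well-number-002", "Well-number-009")

# Sequential replacement: the second call works on the output of the first,
# so the well that started as 002 also ends up as 018.
step1 <- str_replace_all(x, "002", "009")
step2 <- str_replace_all(step1, "009", "018")

# Single pass: every well number is matched once and looked up in a table.
# well_map is a hypothetical old -> new lookup; extend it with the real pairs.
well_map <- c("002" = "009", "009" = "018")
single_pass <- str_replace_all(x, "(?<=Well-number-)\\d{3}", function(old) {
  new <- unname(well_map[old])
  ifelse(is.na(new), old, new)  # keep wells that are not in the table
})

step2        # "Well-number-018" "Well-number-018"
single_pass  # "Well-number-009" "Well-number-018"
```

This relies on `str_replace_all()` accepting a function as `replacement`, which needs a reasonably recent stringr. Note that passing a named vector of patterns to `str_replace_all()` applies the replacements one after another, so it cascades in the same way.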

Any help would be greatly appreciated!



Solution 1:[1]

I don't understand the logic for the changes (you didn't answer @MrFlick's questions), but this should get you started:

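library(tidyverse)

df <- tibble(col1 = "Plate 2 Day 2 - 220304_Plate-2_Day-2-Well-number-001_Processed_PrintToExcel.xlsx")

df %>% 
    # split the string into plate number, day number, and everything after " - "
    extract(col1, 
            into = c("plate_no", "day_no", "rest"),
            regex = "^Plate (\\d+) Day (\\d+) - (.*)") %>% 
    # recode in a single step, so a new value is never matched a second time
    mutate(plate_no = case_when(plate_no == "2" ~ "009",
                                plate_no == "9" ~ "018",
                                TRUE ~ plate_no)) %>%   # leave other values unchanged (otherwise they become NA)
    # paste the pieces back together
    mutate(new_col1 = paste0("Plate ", plate_no, " Day ", day_no, " - ", rest))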

Use a regular expression (regex) to split the string into multiple columns. Then do your changes. Whatever logic you use, do it here and it won't overwrite itself. Then, paste everything back together again. If you want help writing the regex, then you'll need to specify the logic you want.
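The same split, recode, and paste idea can be pointed at the well number instead. This is only a sketch: `well_map` is a hypothetical lookup table holding two of the pairs mentioned in the question, and the regex assumes the well number is always three digits directly after "Well-number-".

```r
library(dplyr)
library(tidyr)

df <- tibble(col1 = c(
  "Plate 2 Day 2 - 220304_Plate-2_Day-2-Well-number-002_Processed_PrintToExcel.xlsx",
  "Plate 2 Day 2 - 220304_Plate-2_Day-2-Well-number-009_Processed_PrintToExcel.xlsx",
  "Plate 2 Day 2 - 220304_Plate-2_Day-2-Well-number-001_Processed_PrintToExcel.xlsx"
))

# Hypothetical old -> new lookup; fill in the full set of pairs you need
well_map <- c("002" = "009", "009" = "018")

result <- df %>%
  # split into text before the well number, the well number, and the rest
  extract(col1,
          into = c("prefix", "well_no", "suffix"),
          regex = "^(.*Well-number-)([0-9]{3})(.*)$",
          remove = FALSE) %>%
  # look each well up once; wells missing from the table keep their value
  mutate(well_no = coalesce(unname(well_map[well_no]), well_no),
         new_col1 = paste0(prefix, well_no, suffix))

result$new_col1
```

Because every row is looked up against its original well number, 002 becomes 009 and 009 becomes 018 without the two changes interfering, and 001 is left alone.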

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Michael Dewar