How to sequentially replace the subsequences of a string with another string
I have a main string that looks like this:
my_main <- "ABCDEFGHIJ"
I want to overwrite a substring of it with another string, starting at each position in turn:
my_insert <- "xxx" # its length can vary from 1 up to nchar(my_main)
The final result should be a vector containing these strings:
xxxDEFGHIJ
AxxxEFGHIJ
ABxxxFGHIJ
ABCxxxGHIJ
ABCDxxxHIJ
ABCDExxxIJ
ABCDEFxxxJ
ABCDEFGxxx
If my_insert <- "xxxxxxxxxx" (the same length as my_main), the output is a vector containing just one string, xxxxxxxxxx.
How can I achieve that?
Solution 1:[1]
my_main <- "ABCDEFGHIJ"
my_insert <- "xxx"
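n_pos <- nchar(my_main) - nchar(my_insert) + 1 # number of start positions
x <- character(n_pos) # preallocate the result
for (i in seq_len(n_pos)) {
  s <- my_main
  # overwrite the nchar(my_insert) characters starting at position i
  substr(s, i, i + nchar(my_insert) - 1) <- my_insert
  x[i] <- s
}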
x
#[1] "xxxDEFGHIJ" "AxxxEFGHIJ" "ABxxxFGHIJ" "ABCxxxGHIJ" "ABCDxxxHIJ"
#[6] "ABCDExxxIJ" "ABCDEFxxxJ" "ABCDEFGxxx"
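The loop above can be wrapped in a small function. The sketch below is one way to do that; the name `slide_replace` and its argument names are hypothetical and not part of the original answer. It uses `seq_len()` so that an insert longer than the main string gives an empty result instead of iterating over `1:0`.

```r
# Hypothetical helper: overwrite `main` with `insert` at every start position.
slide_replace <- function(main, insert) {
  n_pos <- nchar(main) - nchar(insert) + 1L
  # seq_len(0) is empty, whereas 1:0 would count down through 1 and 0
  starts <- seq_len(max(n_pos, 0L))
  out <- character(length(starts))
  for (i in starts) {
    s <- main
    substr(s, i, i + nchar(insert) - 1L) <- insert
    out[i] <- s
  }
  out
}

slide_replace("ABCDEFGHIJ", "xxx")         # eight strings, as in the question
slide_replace("ABCDEFGHIJ", "xxxxxxxxxx")  # a single string
slide_replace("ABC", "xxxx")               # character(0)
```

Returning `character(0)` for an over-long insert is a design choice made for this sketch; raising an error would be equally reasonable.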
Solution 2:[2]
Here is a general approach; it works for any my_insert that is no longer than my_main:
my_main <- "ABCDEFGHIJ"
my_insert <- "xxx"
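x <- strsplit(my_main, "")[[1]] # characters of the main string
y <- nchar(my_insert) # width of the window to overwrite
a <- strsplit(my_insert, "")[[1]] # characters of the insert
sapply(seq_len(length(x) - y + 1), function(i) {
  z <- x
  z[i:(i + y - 1)] <- a # overwrite positions i .. i + y - 1
  paste(z, collapse = '')
})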
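The same result can also be sketched without splitting into single characters, by pasting together the part before the window, the insert, and the part after the window. This is a different technique from the answer above (it relies on `substring()` being vectorized over its start and end positions), and the variable names `k` and `starts` are only illustrative. The sketch assumes the insert is no longer than the main string.

```r
my_main <- "ABCDEFGHIJ"
my_insert <- "xxx"

k <- nchar(my_insert)                      # window width
starts <- seq_len(nchar(my_main) - k + 1L) # start positions 1..8

# prefix (may be empty) + insert + suffix (may be empty), for all starts at once
res <- paste0(substring(my_main, 1L, starts - 1L),
              my_insert,
              substring(my_main, starts + k))
res
```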
Solution 3:[3]
Using `substr<-()`.
sapply(seq_len(nchar(my_main) - nchar(my_insert) + 1L), \(i)
       `substr<-`(my_main, i, i + nchar(my_insert) - 1L, my_insert))
# [1] "xxxDEFGHIJ" "AxxxEFGHIJ"
# [3] "ABxxxFGHIJ" "ABCxxxGHIJ"
# [5] "ABCDxxxHIJ" "ABCDExxxIJ"
# [7] "ABCDEFxxxJ" "ABCDEFGxxx"
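A note on the `stop` argument: `substr<-` never changes the length of the string and writes at most as many characters as the replacement contains, so a `stop` that reaches past the intended window does not overwrite extra characters. A minimal illustration in base R (the variable names `s` and `s2` are just for this example):

```r
s <- "ABCDEFGHIJ"
# stop = 10 reaches far past the three characters being written
substr(s, 2, 10) <- "xxx"
s
# [1] "AxxxEFGHIJ"

s2 <- "ABCDEFGHIJ"
# replacement longer than the start:stop window: only its first two characters are used
substr(s2, 2, 3) <- "xxxxx"
s2
# [1] "AxxDEFGHIJ"
```

Computing `stop` from `nchar(my_insert)` is still the clearer choice, since it states the intended window explicitly.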
A version using vapply(), which can be slightly faster than sapply() and guarantees a character result:
vapply(seq_len(nchar(my_main) - nchar(my_insert) + 1L), \(i)
       `substr<-`(my_main, i, i + nchar(my_insert) - 1L, my_insert), character(1L))
Data:
my_main <- "ABCDEFGHIJ"
my_insert <- "xxx"
Solution 4:[4]
Use `regmatches<-` to do the replacement, since it is vectorized over the strings and the match positions.
my_main <- "ABCDEFGHIJ"
my_insert <- "xxx"
len_main <- nchar(my_main)
len_insert <- nchar(my_insert)
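string <- rep(my_main, len_main - len_insert + 1) # one copy of the string per start position
len <- sequence(len_main - len_insert + 1) # start positions 1, 2, ...
attr(len, 'match.length') <- rep(len_insert, length(len)) # width of each span to replace
regmatches(string, len) <- my_insert # replace each span with the insert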
string
# [1] "xxxDEFGHIJ" "AxxxEFGHIJ" "ABxxxFGHIJ" "ABCxxxGHIJ" "ABCDxxxHIJ" "ABCDExxxIJ"
# [7] "ABCDEFxxxJ" "ABCDEFGxxx"
You could write a simple function to do this:
my_fun <- function(main, insert) {
  len_main <- nchar(main)
  len_insert <- nchar(insert)
  size <- len_main - len_insert + 1
  string <- rep(main, size) # repeat each string once per start position
  len <- sequence(size) # obtain the positions
  attr(len, 'match.length') <- rep(len_insert, size)
  regmatches(string, len) <- rep(insert, size)
  string
}
my_fun(c('ABCDE', 'ABCDEFGH'), c('xx', 'xxxxxx'))
# [1] "xxCDE" "AxxDE" "ABxxE" "ABCxx" "xxxxxxGH" "AxxxxxxH" "ABxxxxxx"
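The function accepts vectors because `rep()` and `sequence()` both take a vector of counts. The sketch below reproduces the intermediate values for the call above; the variable names are illustrative and the block is independent of `my_fun`.

```r
main <- c("ABCDE", "ABCDEFGH")
insert <- c("xx", "xxxxxx")

size <- nchar(main) - nchar(insert) + 1  # start positions per pair: 4 and 3
positions <- sequence(size)              # 1 2 3 4 1 2 3
copies <- rep(main, size)                # "ABCDE" four times, "ABCDEFGH" three times
widths <- rep(nchar(insert), size)       # 2 2 2 2 6 6 6

data.frame(copies, positions, widths)
```

Each row of the data frame corresponds to one element of the result: the string to modify, the position where the replacement starts, and how many characters are replaced.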
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | |
| Solution 3 | |
| Solution 4 | |
