How to sequentially replace the subsequences of a string with another string

I have a main string that looks like this:

my_main <- "ABCDEFGHIJ"

I want to overwrite the substring at each successive position with another string:

my_insert <- "xxx" # its length can vary from 1 up to nchar(my_main)

The final result should be a character vector containing these strings:

xxxDEFGHIJ
AxxxEFGHIJ
ABxxxFGHIJ
ABCxxxGHIJ
ABCDxxxHIJ
ABCDExxxIJ
ABCDEFxxxJ
ABCDEFGxxx

If my_insert <- "xxxxxxxxxx", then the final output is a vector containing just one string, xxxxxxxxxx.

How can I achieve that?



Solution 1:[1]

my_main <- "ABCDEFGHIJ"
my_insert <- "xxx"
x <- character(0)
for (i in 1:(nchar(my_main) - nchar(my_insert) + 1)) {
    s <- my_main
    # overwrite nchar(my_insert) characters starting at position i
    substr(s, i, i + nchar(my_insert) - 1) <- my_insert
    x[i] <- s
}
x
#[1] "xxxDEFGHIJ" "AxxxEFGHIJ" "ABxxxFGHIJ" "ABCxxxGHIJ" "ABCDxxxHIJ"
#[6] "ABCDExxxIJ" "ABCDEFxxxJ" "ABCDEFGxxx"
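
As a minimal sketch, the loop above can be wrapped in a function so the edge cases from the question are easy to check. The name `slide_replace` is hypothetical and not part of the original answer. It uses `seq_len()` so that an insert longer than the main string yields an empty result instead of a backwards `1:0` sequence.

```r
# Hypothetical helper: slide `insert` across `main`, overwriting in place.
slide_replace <- function(main, insert) {
  n_pos <- nchar(main) - nchar(insert) + 1L
  # seq_len() of 0 is empty, so no iterations happen when insert is too long
  starts <- seq_len(max(n_pos, 0L))
  out <- character(length(starts))
  for (i in starts) {
    s <- main
    substr(s, i, i + nchar(insert) - 1L) <- insert
    out[i] <- s
  }
  out
}

slide_replace("ABCDEFGHIJ", "xxx")         # 8 strings, as in the question
slide_replace("ABCDEFGHIJ", "xxxxxxxxxx")  # one string: "xxxxxxxxxx"
slide_replace("ABC", "xxxx")               # character(0)
```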

Solution 2:[2]

Here is a general approach that works for any my_insert no longer than my_main:

my_main <- "ABCDEFGHIJ"
my_insert <- "xxx"

x <- strsplit(my_main, "")[[1]]
y <- nchar(my_insert)

a <- strsplit(my_insert,"")[[1]]

sapply(1:(length(x) - y + 1), function(i) {
  z <- x
  z[i:(i + y - 1)] <- a
  paste(z, collapse = '')
})
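
The same prefix / replacement / suffix idea can also be sketched on whole substrings instead of single characters. This is a different technique from the answer above: it relies on `substring()` being vectorized over its start and stop arguments, so no explicit loop is needed. The variable names `starts` and `result` are illustrative.

```r
my_main <- "ABCDEFGHIJ"
my_insert <- "xxx"

# one start position per possible placement of my_insert
starts <- seq_len(nchar(my_main) - nchar(my_insert) + 1L)

result <- paste0(
  substring(my_main, 1L, starts - 1L),           # text before the insert ("" when starts is 1)
  my_insert,
  substring(my_main, starts + nchar(my_insert))  # text after the insert
)
result
```

Because every piece is computed in one vectorized call, this form may be convenient when the main string is long, though it has not been benchmarked here.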

Solution 3:[3]

Using `substr<-()`.

sapply(seq_len(nchar(my_main) - nchar(my_insert) + 1L), \(i) 
       `substr<-`(my_main, i, i + nchar(my_insert) - 1L, my_insert))
# [1] "xxxDEFGHIJ" "AxxxEFGHIJ"
# [3] "ABxxxFGHIJ" "ABCxxxGHIJ"
# [5] "ABCDxxxHIJ" "ABCDExxxIJ"
# [7] "ABCDEFxxxJ" "ABCDEFGxxx"

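A vapply() version, which is typically a little faster and also checks that each result is a single string:

vapply(seq_len(nchar(my_main) - nchar(my_insert) + 1L), \(i) 
       `substr<-`(my_main, i, i + nchar(my_insert) - 1L, my_insert), character(1L))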

Data:

my_main <- "ABCDEFGHIJ"
my_insert <- "xxx"
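
One property of `substr<-` that this approach relies on, shown as a small sketch: the assignment never changes the length of the string. If the target range is longer than the replacement, only as many characters as the replacement has are overwritten, and if the range is shorter, the replacement is truncated to fit. The names `s1` and `s2` are illustrative.

```r
s1 <- "ABCDEFGHIJ"
substr(s1, 2, 10) <- "xxx"  # range is 9 characters, but only 3 are overwritten
s1
# [1] "AxxxEFGHIJ"

s2 <- "ABCDEFGHIJ"
substr(s2, 9, 10) <- "xxx"  # range is 2 characters, so only "xx" is used
s2
# [1] "ABCDEFGHxx"
```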

Solution 4:[4]

Use `regmatches<-` to do the replacement, since it is vectorized over the strings and their match positions.

my_main <- "ABCDEFGHIJ"
my_insert <- "xxx"
len_main <- nchar(my_main)
len_insert <- nchar(my_insert)
string <- rep(my_main, len_main - len_insert + 1) # repeat the string once per position
len <- sequence(len_main - len_insert + 1) # start positions 1, 2, ...
attr(len, 'match.length') <- rep(len_insert, length(len)) # one match length per position
regmatches(string, len) <- my_insert # replace
string

# [1] "xxxDEFGHIJ" "AxxxEFGHIJ" "ABxxxFGHIJ" "ABCxxxGHIJ" "ABCDxxxHIJ" "ABCDExxxIJ"
# [7] "ABCDEFxxxJ" "ABCDEFGxxx"

You could write a simple function to do this:

my_fun <- function(main, insert){
  len_main <- nchar(main)
  len_insert <- nchar(insert)
  size <- len_main - len_insert + 1
  string <- rep(main, size) #repeat the string
  len <- sequence(size) # obtain  the positions
  attr(len, 'match.length') <- rep(len_insert, size)
  regmatches(string, len) <- rep(insert, size)
  string
}

my_fun(c('ABCDE', 'ABCDEFGH'), c('xx', 'xxxxxx'))
# [1] "xxCDE"    "AxxDE"    "ABxxE"    "ABCxx"    "xxxxxxGH" "AxxxxxxH" "ABxxxxxx"
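
To see why a hand-built position vector works here, it may help to look at what `regexpr()` itself returns: an integer start position carrying a `match.length` attribute, which is the structure the code above mimics. The following is a small sketch using a fixed-string match. The variable names `x` and `m` are illustrative.

```r
x <- "ABCDEFGHIJ"
m <- regexpr("CDE", x, fixed = TRUE)

as.integer(m)            # start position of the match
# [1] 3
attr(m, "match.length")  # number of characters matched
# [1] 3

regmatches(x, m) <- "xxx"  # overwrite the matched span
x
# [1] "ABxxxFGHIJ"
```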

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
