Replace every sequence of letters with a random word

I created a list of random words:

library(OpenRepGrid)
list_of_words <- randomWords(100)
list_of_words <- gsub("[^A-Za-z ]", "", list_of_words)
list_of_words <- list_of_words[nchar(list_of_words) %in% 4:6]
list_of_words <- list_of_words[!(duplicated(list_of_words)|duplicated(list_of_words, fromLast=TRUE))]

And I have a string as follows:

dat_string <- "Code bla-group Description bla-groep somecoëfficiënt\nP1 building 0,325\nN2111 veggies 0,387"

I would like to replace every run of consecutive letters (Code, bla, Description, ...) with a random word from list_of_words.

I thought of doing:

dat_string <- gsub("[:alpha:]",sample(list_of_words),dat_string) 

But the output is not what I expected:

"Code bHarryHarry-grouHarry DescriHarrytion bHarryHarry-groeHarry somecoëfficiënt\nP1 buiHarryding 0,325\nN2111 veggies 0,387"

Could anyone explain to me what I am doing wrong here?



Solution 1:[1]

You can use

library(stringr)
str_replace_all(dat_string, "\\p{L}+", function(x) sample(list_of_words, 1))

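Here, \p{L}+ matches one or more Unicode letters (so it matches every word, including accented ones such as coëfficiënt), and each match is replaced by a random element of the list_of_words character vector.

The original attempt fails for two reasons. First, [:alpha:] is a POSIX class that is only valid inside a bracket expression, so the pattern "[:alpha:]" on its own matches any single one of the characters :, a, l, p or h; the letter class would have to be written "[[:alpha:]]+". Second, gsub() uses only the first element of a replacement vector, so every match received the same word.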
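If stringr is not available, the same idea can probably be expressed in base R. The following is only a sketch, not part of the original answer: it swaps str_replace_all for gregexpr() plus the regmatches<- replacement function. The short list_of_words is a hypothetical stand-in for the list built in the question, set.seed() is there only for reproducibility, and the name result is introduced for illustration.

```r
set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible

# Hypothetical stand-in for the question's list_of_words
list_of_words <- c("apple", "breeze", "cloud", "delta", "ember")

# Same string as in the question; the accented letters are written as
# \u escapes so the string is UTF-8 regardless of the file encoding
dat_string <- "Code bla-group Description bla-groep someco\u00ebffici\u00ebnt\nP1 building 0,325\nN2111 veggies 0,387"

# Locate every run of Unicode letters (perl = TRUE enables \p{L})
m <- gregexpr("\\p{L}+", dat_string, perl = TRUE)

# Draw one random word per match; replace = TRUE because there may be
# more matches than words in the list
result <- dat_string
regmatches(result, m) <- lapply(
  regmatches(result, m),
  function(x) sample(list_of_words, length(x), replace = TRUE)
)

cat(result, "\n")
```

Because the words are drawn separately for each match, repeated tokens such as bla can end up with different replacements. Single letters that sit next to digits (the P in P1, the N in N2111) are runs of letters too, so they are replaced as well.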

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Wiktor Stribiżew