Replace every sequence of letters with a random word
I created a list of random words:
library(OpenRepGrid)
list_of_words <- randomWords(100)
list_of_words <- gsub("[^A-Za-z ]", "", list_of_words)
list_of_words <- list_of_words[nchar(list_of_words) %in% 4:6]
list_of_words <- list_of_words[!(duplicated(list_of_words)|duplicated(list_of_words, fromLast=TRUE))]
And I have a string as follows:
dat_string <- "Code bla-group Description bla-groep somecoëfficiënt\nP1 building 0,325\nN2111 veggies 0,387"
I would like to replace every run of consecutive letters (Code, bla, Description, ...) with a random word from list_of_words.
I thought of doing:
dat_string <- gsub("[:alpha:]",sample(list_of_words),dat_string)
But the output is not what I expected:
"Code bHarryHarry-grouHarry DescriHarrytion bHarryHarry-groeHarry somecoëfficiënt\nP1 buiHarryding 0,325\nN2111 veggies 0,387"
Could anyone explain to me what I am doing wrong here?
Solution 1:[1]
You can use
library(stringr)
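str_replace_all(dat_string, "\\p{L}+", function(x) sample(list_of_words, length(x), replace = TRUE))
Here, \p{L}+ matches one or more Unicode letters (so each run of letters is one match), and every match is replaced by a random element of the list_of_words character vector. Drawing length(x) words with replace = TRUE returns one word per match, so the call works whether stringr invokes the function once per match or once with all matches in a single vector (as recent stringr releases do).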
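As for why the original gsub() call misbehaves, two things seem to be going on. First, [:alpha:] is only a character class when it sits inside a bracket expression, i.e. [[:alpha:]]; written with single brackets, the pattern matches any one of the characters ":", "a", "l", "p" or "h", which is why only those letters were replaced. Second, gsub() evaluates its replacement once and uses only the first element of a replacement vector (with a warning), so every match receives the same word. A base R sketch that avoids both problems is shown below; list_of_words and demo_string here are small hypothetical stand-ins, not the data from the question, and whether [[:alpha:]] matches accented letters such as ë depends on the locale.

```r
# Hypothetical stand-ins so the sketch runs without OpenRepGrid
list_of_words <- c("apple", "stone", "river", "cloud")
demo_string <- "Code bla-group\nP1 building 0,325"

# Find every run of letters; note the doubled brackets in [[:alpha:]]
matches <- gregexpr("[[:alpha:]]+", demo_string)

# Work on a copy so the input stays untouched
result <- demo_string

# Draw one random word per match (with replacement, since there may be
# more matches than words) and write the words back in place
regmatches(result, matches) <- lapply(
  regmatches(demo_string, matches),
  function(hits) sample(list_of_words, length(hits), replace = TRUE)
)

cat(result, "\n")
```

Digits, hyphens, commas and line breaks are left alone because they are never part of a match; only the letter runs are swapped out.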
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Wiktor Stribiżew |
