How to remove a substring that starts with "RT" and ends with ":"
I have a dataset with a column consisting of tweets. Some tweets are retweets, which start with RT @username: ..... I would like to remove this part of the string while keeping the string that comes after it.
See the example below:
stringsExample <- c("RT @WhiteHouse: Yesterday, President Biden...",
"During World War II...")
The results I want are: Yesterday, President Biden... During World War II...
Solution 1:[1]
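Replace everything from "RT" at the start of the string (regex ^) through the first colon and the space after it with the empty string "". The .+? matches one or more characters lazily, so the match stops at the first ": " rather than the last one. Strings that do not start with "RT" are left unchanged.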
gsub("^RT.+?: ", "", stringsExample)
[1] "Yesterday, President Biden..." "During World War II..."
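Note that "^RT.+?: " also matches any tweet that merely begins with the letters "RT" and contains ": " later on. If that matters for your data, a stricter variant is to require the "@username" part explicitly. The following is a minimal sketch, assuming usernames consist only of letters, digits and underscores; the third test string and the variable names x and cleaned are made up for illustration.

```r
# Example input: one retweet, one ordinary tweet, and one tweet that
# starts with "RT" but is not a retweet (hypothetical test case).
x <- c("RT @WhiteHouse: Yesterday, President Biden...",
       "During World War II...",
       "RTX 4090 review: fast")

# Require "RT @", a username made of word characters, a colon,
# and then any amount of whitespace. sub() is enough because the
# pattern is anchored with ^ and can match at most once per string.
cleaned <- sub("^RT @\\w+:\\s*", "", x)

print(cleaned)
```

With this pattern only the first string is changed. The original pattern would also shorten the third string to "fast".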
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | benson23 |
