How to remove a substring that starts with "RT" and ends with ":"
I have a dataset with a column consisting of tweets. Some tweets are retweets, which start with RT @username: .... I would like to remove this prefix while keeping the text that comes after it.
See the example below:
stringsExample <- c("RT @WhiteHouse: Yesterday, President Biden...",
"During World War II...")
The results I want are:
Yesterday, President Biden...
During World War II...
Solution 1:[1]
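Replace everything from the start of the string (regex ^) that begins with "RT", followed by one or more characters matched lazily (regex .+?), up to and including the first colon and the space after it (": "), with an empty string "". Strings that do not start with "RT" are left unchanged.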
gsub("^RT.+?: ", "", stringsExample)
[1] "Yesterday, President Biden..." "During World War II..."
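The pattern above also strips the start of any tweet that merely begins with the letters "RT" and contains a colon followed by a space later on. A stricter variant, sketched here as one possible alternative rather than as part of the original answer, requires the literal "RT @" plus a handle made of word characters; the assumption that handles match \w+ and the variable names pattern and cleaned are illustrative choices. Because the pattern is anchored with ^, sub() is sufficient and gsub() is not needed.

```r
stringsExample <- c("RT @WhiteHouse: Yesterday, President Biden...",
                    "During World War II...")

# Assumed prefix shape: "RT @", a handle of word characters, a colon,
# then any amount of whitespace (including none).
pattern <- "^RT @\\w+:\\s*"

# sub() replaces only the first match, which is all an anchored pattern can produce.
cleaned <- sub(pattern, "", stringsExample)
print(cleaned)

# A tweet that starts with "RT" but is not a retweet is left untouched.
print(sub(pattern, "", "RTX cards: great"))
```

To apply this to a data frame column, the same call works on the column vector, for example df$text <- sub(pattern, "", df$text), where df and text are hypothetical names.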
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | benson23 |