Remove all characters in a string after the last occurrence of a pattern in R

I want to remove all characters after the last occurrence of a specific pattern in a string in R.

For example:

string = "asdsads dfdsfd>x 442 /<sdasvre (geqwe) ge ge ge regthyty "

I would like to remove everything after the last occurrence of the pattern "ge" and end up with:

"asdsads dfdsfd>x 442 /<sdasvre (geqwe) ge ge ge".



Solution 1:[1]

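You can use a capture group to capture everything up to and including the last "ge" (^(.*ge)). Because .* is greedy, it consumes as much of the string as possible before backtracking to a "ge", so the group ends at the last occurrence. The rest of the pattern matches the remainder of the string, and the whole match is replaced with the contents of the capture group (\\1).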

sub('^(.*ge).*$', '\\1', string)
[1] "asdsads dfdsfd>x 442 /<sdasvre (geqwe) ge ge ge"
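
To make the greedy-capture idea reusable, here is a minimal sketch that wraps it in a function. The name keep_through_last is hypothetical, and the sketch assumes that pattern is already a valid regular expression (any metacharacters in it are not escaped). The tail is matched with .* rather than .+ so that a string which already ends in the pattern is returned intact; with .+ the regex would have to backtrack to an earlier occurrence to leave at least one character for the tail.

```r
# Hypothetical helper: keep everything up to and including the last match of `pattern`.
# Assumption: `pattern` is a valid regex fragment; it is pasted in without escaping.
keep_through_last <- function(x, pattern) {
  # Greedy `.*` inside the group makes the group end at the LAST match of `pattern`.
  # The trailing `.*` consumes whatever follows (possibly nothing).
  sub(paste0("^(.*", pattern, ").*$"), "\\1", x)
}

string <- "asdsads dfdsfd>x 442 /<sdasvre (geqwe) ge ge ge regthyty "

keep_through_last(string, "ge")
# [1] "asdsads dfdsfd>x 442 /<sdasvre (geqwe) ge ge ge"

keep_through_last("ab ge ge", "ge")       # string already ends in the pattern
# [1] "ab ge ge"

keep_through_last("no match here", "ge")  # no match: sub() returns the input unchanged
# [1] "no match here"
```

When the pattern does not occur at all, sub() finds nothing to replace and the input comes back unchanged, which is usually the desired behaviour here.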

Solution 2:[2]

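You could use a negative lookahead here. The pattern matches a "ge" that starts at a word boundary and is followed by a space, but only if no other standalone "ge" appears later in the string; that "ge" and everything after it are then replaced with "ge". Lookaheads require perl=TRUE: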

string <- "asdsads dfdsfd>x 442 /<sdasvre (geqwe) ge ge ge regthyty "
output <- sub("\\bge (?!.*\\bge\\b).*", "ge", string, perl=TRUE)
output

[1] "asdsads dfdsfd>x 442 /<sdasvre (geqwe) ge ge ge"
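
The lookahead approach treats "ge" as a whole word, which is not the same as treating it as a plain substring. The sketch below contrasts the two. Both function names are hypothetical, "ge" is hard-coded to sidestep escaping, and the whole-word variant uses \\b after "ge" instead of the literal space in the answer above, which is an assumption made so that a "ge" at the very end of the string also counts.

```r
# Hypothetical helpers for illustration; "ge" is hard-coded to avoid escaping concerns.

# Whole-word variant: match a standalone "ge" that no later standalone "ge" follows,
# then replace it together with the rest of the string by "ge".
trim_after_last_word <- function(x) {
  sub("\\bge\\b(?!.*\\bge\\b).*", "ge", x, perl = TRUE)
}

# Substring variant: any "ge" counts, including one inside a longer word.
trim_after_last_substring <- function(x) {
  sub("ge(?!.*ge).*", "ge", x, perl = TRUE)
}

string <- "asdsads dfdsfd>x 442 /<sdasvre (geqwe) ge ge ge regthyty "
trim_after_last_word(string)
# [1] "asdsads dfdsfd>x 442 /<sdasvre (geqwe) ge ge ge"

# The two variants differ once "ge" occurs inside another word ("large"):
trim_after_last_word("ge one large two")
# [1] "ge"
trim_after_last_substring("ge one large two")
# [1] "ge one large"
```

Which variant is appropriate depends on whether the pattern in question is meant as a word or as an arbitrary character sequence; the question's example gives the same result under both readings.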

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Tim Biegeleisen