Extract text after first upper case or space

How can I extract all the text after the first space in a column where the data looks like this:

structure(list(value = c("1.1.a Blue sea", "1.2.a Red ball")), row.names = c(NA, -2L), class =c("tbl_df", "tbl", "data.frame"))

so that I get a new column containing just:

Blue sea
Red ball

r dplyr stringr

Solution 1:^[1]

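You can use the following code to keep all the text after the first whitespace. The pattern matches the leading run of non-whitespace characters (^\\S+) plus the whitespace that follows it (\\s+), and sub replaces that match with an empty string: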

sub("^\\S+\\s+", '', df$value)

Output:

[1] "Blue sea" "Red ball"
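To see how the pattern behaves on less tidy input, here is a small base R sketch. The extra strings (one with a double space, one with no space at all) are made up for illustration and are not part of the question's data.

```r
# Hypothetical inputs: the second has two spaces, the third has none
x <- c("1.1.a Blue sea", "1.2.a  Red ball", "NoSpaceHere")

# ^\\S+ matches the leading non-whitespace token,
# \\s+ matches all whitespace directly after it
res <- sub("^\\S+\\s+", "", x)

print(res)
```

A string with no whitespace does not match the pattern, so sub returns it unchanged rather than returning NA or an empty string.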

You can use the same call inside mutate to store the result as a new column:

library(dplyr)
df %>%
  mutate(new_value = sub("^\\S+\\s+", '', value))

Output:

# A tibble: 2 × 2
  value          new_value
  <chr>          <chr>    
1 1.1.a Blue sea Blue sea 
2 1.2.a Red ball Red ball

Solution 2:^[2]

You can use str_extract from the stringr package to extract the first match that starts with an upper case letter ([[:upper:]]) and is followed by one or more characters (.+) up to the end of the string ($).

library(stringr)

str_extract(df$value, "[[:upper:]].+$")

If you don't want to write a regex pattern, you can use str_split to split each string into two parts at the first space (n = 2 limits the result to two pieces) and keep the second part.

str_split(df$value, " ", n = 2, simplify = TRUE)[, 2]

Output

Both methods return the same result for this data:

[1] "Blue sea" "Red ball"
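The "first upper case" rule and the "first space" rule agree on the sample data but can differ on other input. The sketch below uses base R equivalents (sub for the first-space rule, regexpr with regmatches for the upper-case rule) so that it runs without extra packages. The second and third strings are hypothetical and not from the question.

```r
# Hypothetical inputs where the two rules disagree
x <- c("1.1.a Blue sea", "A.1 red ball", "1.3.a green Field")

# Rule 1: everything after the first run of whitespace
after_space <- sub("^\\S+\\s+", "", x)

# Rule 2: everything from the first upper case letter onward
m <- regexpr("[[:upper:]].+$", x)
from_upper <- rep(NA_character_, length(x))  # NA where nothing matches
from_upper[m > 0] <- regmatches(x, m)

print(after_space)
print(from_upper)
```

If the prefix can contain upper case letters, or the wanted text can start in lower case, the first-space rule is likely the safer choice.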

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	Quinten
Solution 2
