How to get the number of elements separated by an underscore

strings<-c("A_A_A","B", "C_C_C_C", "D_D_D_D_D")

I have this vector of strings. I want the number of underscore-separated elements in each string, so that the result is:

[1] 3  1  4  5


Solution 1:[1]

stringr::str_count(c("A_A_A","B", "C_C_C_C", "D_D_D_D_D"), '_') + 1

Solution 2:[2]

For a base R option, we could use string replacement:

strings <- c("A_A_A","B", "C_C_C_C", "D_D_D_D_D")
nchar(strings) - nchar(gsub("_", "", strings, fixed=TRUE)) + 1

[1] 3 1 4 5

Solution 3:[3]

You can use lengths with strsplit.

strings <- c("A_A_A", "B", "C_C_C_C", "D_D_D_D_D")

lengths(strsplit(strings, split = "_"))
# [1] 3 1 4 5
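
Counting separators and splitting agree on the sample data, but they can differ on edge cases. The sketch below compares the two on a hypothetical vector `x` (not from the question) that includes an empty string and leading or trailing underscores. Which answer is "right" for those inputs depends on what you consider an element.

```r
# Hypothetical input with edge cases: empty string, trailing and leading "_".
x <- c("A_A_A", "B", "", "A_", "_A")

# Approach 1: count the separators and add one.
by_separator <- nchar(x) - nchar(gsub("_", "", x, fixed = TRUE)) + 1

# Approach 2: split on the separator and count the pieces.
# strsplit() returns no pieces for "" and drops a trailing empty piece.
by_split <- lengths(strsplit(x, "_", fixed = TRUE))

by_separator
# [1] 3 1 1 2 2
by_split
# [1] 3 1 0 1 2
```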

Solution 4:[4]

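Yet another possible solution, which removes the underscores and counts the remaining characters (this equals the number of elements only when every element is a single character):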

library(tidyverse)

strings<-c("A_A_A","B", "C_C_C_C", "D_D_D_D_D")

strings %>% 
  str_remove_all("_") %>% 
  str_count

#> [1] 3 1 4 5

Solution 5:[5]

Another option with nchar + gsub

> nchar(gsub("[^_]", "", strings)) + 1
[1] 3 1 4 5

or, if every element is a single character,

> nchar(gsub("_", "", strings))
[1] 3 1 4 5
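
The two variants above only agree when every element is one character long. A small sketch with a hypothetical vector `strings2` (an assumption, not part of the question) shows the difference once elements have more than one character:

```r
# Hypothetical data with multi-character elements.
strings2 <- c("AB_CD", "E", "FGH_I_JK")

# Counting separators still gives the number of elements.
n_by_separator <- nchar(gsub("[^_]", "", strings2)) + 1
n_by_separator
# [1] 2 1 3

# Counting the characters left after removing "_" gives the total
# number of characters instead, not the number of elements.
n_by_chars <- nchar(gsub("_", "", strings2))
n_by_chars
# [1] 4 1 6
```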

Solution 6:[6]

If your structure is always like your sample data, that is, single characters separated by a one-character separator, you can simply do

ceiling(nchar(strings)/2)

# [1] 3 1 4 5

or

nchar(strings)%/%2+1

# [1] 3 1 4 5
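
Since this shortcut depends on the shape of the data, one option is to check that shape first. The following is a minimal sketch, not part of the original answer: the regular expression and the fallback to counting separators are assumptions chosen for illustration.

```r
strings <- c("A_A_A", "B", "C_C_C_C", "D_D_D_D_D")

# TRUE only for strings made of single characters joined by single underscores.
fits_pattern <- grepl("^[^_](_[^_])*$", strings)

if (all(fits_pattern)) {
  # Safe to use the length-based shortcut.
  counts <- ceiling(nchar(strings) / 2)
} else {
  # Fall back to counting separators, which does not depend on element width.
  counts <- nchar(gsub("[^_]", "", strings)) + 1
}
counts
# [1] 3 1 4 5
```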

Solution 7:[7]

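Another option is to specify the types of characters that you want to count. In this case, we can just count the alphabetic characters (i.e., "[[:alpha:]]"), or, if you also have numbers, you could use "[[:alnum:]]". Note that this counts characters, so it matches the number of elements only when each element is a single character.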

library(stringr)

str_count(strings, "[[:alpha:]]")
# [1] 3 1 4 5

Or in base R:

nchar(gsub("([^[:alpha:]])+", "", strings))
# [1] 3 1 4 5

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Baraliuh
Solution 2 Tim Biegeleisen
Solution 3 Maël
Solution 4 PaulS
Solution 5 ThomasIsCoding
Solution 6
Solution 7 AndrewGB