How to get the number of elements separated by an underscore
strings<-c("A_A_A","B", "C_C_C_C", "D_D_D_D_D")
I have this vector of strings. I want the number of elements in each string, so that the result is:
[1] 3 1 4 5
Solution 1:[1]
stringr::str_count(c("A_A_A","B", "C_C_C_C", "D_D_D_D_D"), '_') + 1
Solution 2:[2]
For a base R option, we could use string replacement:
strings <- c("A_A_A","B", "C_C_C_C", "D_D_D_D_D")
nchar(strings) - nchar(gsub("_", "", strings, fixed=TRUE)) + 1
[1] 3 1 4 5
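The same idea can be wrapped in a small function so the separator is not hard-coded. This is only a sketch: `count_elements` and its `sep` argument are hypothetical names, not part of the original answer, and it assumes `sep` is a non-empty literal string.

```r
# Hypothetical helper (not from the original answer): count the elements
# separated by a literal separator `sep`.
count_elements <- function(x, sep = "_") {
  # Characters removed by deleting `sep`, divided by the separator width,
  # is the number of separators; the element count is one more than that.
  n_sep <- (nchar(x) - nchar(gsub(sep, "", x, fixed = TRUE))) / nchar(sep)
  n_sep + 1
}

strings <- c("A_A_A", "B", "C_C_C_C", "D_D_D_D_D")
count_elements(strings)
# [1] 3 1 4 5
```

Because it counts separators rather than characters, this should also work when the elements are longer than one character.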
Solution 3:[3]
You can use lengths with strsplit.
strings <- c("A_A_A", "B", "C_C_C_C", "D_D_D_D_D")
lengths(strsplit(strings, split = "_"))
# [1] 3 1 4 5
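One thing to be aware of: strsplit and the separator-counting approaches can disagree on edge cases such as empty strings or a trailing underscore. A quick check, using a made-up vector `edge` that is not part of the question:

```r
# Hypothetical edge cases (not from the question): empty string,
# trailing separator, leading separator.
edge <- c("", "A_", "_A")

# strsplit() returns character(0) for "" and drops a trailing empty piece,
# but keeps a leading one.
split_count <- lengths(strsplit(edge, split = "_", fixed = TRUE))
split_count
# [1] 0 1 2

# Counting separators and adding 1 treats every case as "separators + 1".
sep_count <- nchar(edge) - nchar(gsub("_", "", edge, fixed = TRUE)) + 1
sep_count
# [1] 1 2 2
```

Which behaviour is preferable depends on whether an empty string or a dangling separator should count as an element in your data.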
Solution 4:[4]
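Yet another possible solution, which counts the characters left after removing the underscores (so it assumes each element is a single character):
library(tidyverse)
strings <- c("A_A_A", "B", "C_C_C_C", "D_D_D_D_D")
strings %>%
  str_remove_all("_") %>%
  str_count()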
#> [1] 3 1 4 5
Solution 5:[5]
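Another option with nchar + gsub, keeping only the underscores and adding 1:
nchar(gsub("[^_]", "", strings)) + 1
# [1] 3 1 4 5
or, if every element is a single character, counting what is left after removing the underscores:
nchar(gsub("_", "", strings))
# [1] 3 1 4 5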
Solution 6:[6]
If your structure is always like your sample data, that is, single characters separated by a single-character separator, you can simply do
ceiling(nchar(strings)/2)
# [1] 3 1 4 5
or
nchar(strings) %/% 2 + 1
# [1] 3 1 4 5
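To see why the single-character assumption matters, here is a small comparison on a made-up vector `multi` with longer elements (not part of the question). The separator-based count still gives the number of elements, while the length-based shortcut does not:

```r
# Hypothetical input with multi-character elements (not from the question).
multi <- c("AA_BB", "C", "DDD_E_F")

# Separator-based count: still the number of elements.
by_split <- lengths(strsplit(multi, split = "_", fixed = TRUE))
by_split
# [1] 2 1 3

# Length-based shortcut: only valid for single-character elements.
by_length <- ceiling(nchar(multi) / 2)
by_length
# [1] 3 1 4
```

The same caveat applies to any approach that counts the remaining characters instead of the separators.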
Solution 7:[7]
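Another option is to specify the types of characters that you want to count. In this case, we can just count the alphabetic characters (i.e., "[[:alpha:]]"). If you also have numbers, you could use "[[:alnum:]]" instead. Note that this counts characters, so it matches the number of elements only when each element is a single character.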
library(stringr)
str_count(strings, "[[:alpha:]]")
# [1] 3 1 4 5
Or in base R:
nchar(gsub("([^[:alpha:]])+", "", strings))
# [1] 3 1 4 5
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Baraliuh |
| Solution 2 | Tim Biegeleisen |
| Solution 3 | Maël |
| Solution 4 | PaulS |
| Solution 5 | ThomasIsCoding |
| Solution 6 | |
| Solution 7 | AndrewGB |
