Remove blanks from strsplit in R
> dc1
V1 V2
1 20140211-0100 |Box
2 20140211-1782 |Office|Ball
3 20140211-1783 |Office
4 20140211-1784 |Office
5 20140221-0756 |Box
6 20140203-0418 |Box
> strsplit(as.character(dc1[,2]),"^\\|")
[[1]]
[1] "" "Box"
[[2]]
[1] "" "Office" "Ball"
[[3]]
[1] "" "Office"
[[4]]
[1] "" "Office"
[[5]]
[1] "" "Box"
[[6]]
[1] "" "Box"
How do I remove the blank ("") from the strsplit results? The result should look like:
[[1]]
[1] "Box"
[[2]]
[1] "Office" "Ball"
Solution 1:[1]
You can use lapply on your list to drop the empty strings. I changed the pattern in your strsplit call to match your intended output.
dc1 <- read.table(text = 'V1 V2
1 20140211-0100 |Box
2 20140211-1782 |Office|Ball
3 20140211-1783 |Office
4 20140211-1784 |Office
5 20140221-0756 |Box
6 20140203-0418 |Box', header = TRUE)
out <- strsplit(as.character(dc1[,2]),"\\|")
> lapply(out, function(x){x[!x ==""]})
[[1]]
[1] "Box"
[[2]]
[1] "Office" "Ball"
[[3]]
[1] "Office"
[[4]]
[1] "Office"
[[5]]
[1] "Box"
[[6]]
[1] "Box"
Solution 2:[2]
I do not have a general solution, but for your example you could try:
strsplit(sub("^\\|", "", as.character(dc1[,2])),"\\|")
It removes the leading | (this is what the regex "^\\|" matches), which is what produces the "", before performing the split.
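Solution 2 strips only a single leading delimiter. As a rough sketch of how it might be generalized, the variant below adds a + quantifier so that a run of | characters counts as one delimiter. The sample vector x is made up for illustration and is not part of the question's data.

```r
# Hypothetical input: leading, doubled and repeated leading delimiters
x <- c("|Box", "|Office|Ball", "Box||Ball", "||Box")

# Original approach: strip one leading "|" and split on each "|"
single <- strsplit(sub("^\\|", "", x), "\\|")
print(single[[3]])  # the doubled "|" still yields an empty field

# Sketch of a generalization: strip all leading "|" and split on runs of "|"
runs <- strsplit(sub("^\\|+", "", x), "\\|+")
print(runs)
```

Whether collapsing repeated delimiters is acceptable depends on the data; if an empty field carries meaning, it should not be discarded this way.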
Solution 3:[3]
You could use:
library(stringr)
str_extract_all(dc1[,2], "[[:alpha:]]+")
[[1]]
[1] "Box"
[[2]]
[1] "Office" "Ball"
[[3]]
[1] "Office"
[[4]]
[1] "Office"
[[5]]
[1] "Box"
[[6]]
[1] "Box"
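Note that [[:alpha:]]+ matches letters only, so a tag containing digits or spaces would be cut short. A minimal base R sketch, assuming tags may contain any character except |, extracts runs of non-delimiter characters instead. The vector x, including the "Room 2" entry, is hypothetical.

```r
# Hypothetical input; "Room 2" is not in the question's data
x <- c("|Box", "|Office|Ball", "|Room 2")

# Letters-only pattern, as in the stringr call above, using base R
letters_only <- regmatches(x, gregexpr("[[:alpha:]]+", x))
print(letters_only[[3]])  # the space and the digit are not matched

# Runs of anything that is not "|" keep the whole tag
non_delim <- regmatches(x, gregexpr("[^|]+", x))
print(non_delim)
```

The same "[^|]+" pattern could presumably be passed to str_extract_all as well, since it is an ordinary regular expression.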
Solution 4:[4]
In this case, every string starts with |, so you can just remove the first element of each vector by passing "[" to sapply:
> sapply(strsplit(as.character(dc1[,2]), "\\|"), "[", -1)
# [[1]]
# [1] "Box"
# [[2]]
# [1] "Office" "Ball"
# [[3]]
# [1] "Office"
# [[4]]
# [1] "Office"
# [[5]]
# [1] "Box"
# [[6]]
# [1] "Box"
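Two caveats of this approach can be illustrated with a short sketch; the inputs below are hypothetical. First, sapply simplifies its result when every element has the same length, so the return type depends on the data. Second, dropping the first element blindly loses a real tag when a string has no leading |.

```r
# Hypothetical input in which every row has exactly one tag
same_len <- c("|Box", "|Office", "|Box")

# sapply() simplifies equal-length results to a character vector
res_sapply <- sapply(strsplit(same_len, "\\|"), "[", -1)
print(class(res_sapply))

# lapply() always returns a list, whatever the lengths are
res_lapply <- lapply(strsplit(same_len, "\\|"), "[", -1)
print(class(res_lapply))

# Without a leading "|", index -1 removes a real tag
dropped <- strsplit("Box|Ball", "\\|")[[1]][-1]
print(dropped)
```

If downstream code expects a list, lapply is the safer choice here.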
Solution 5:[5]
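Another method uses nzchar() after unlisting the result of strsplit(). Note that unlist() flattens everything into a single character vector, so the per-row grouping is lost:
out <- unlist(strsplit(as.character(dc1[,2]),"\\|"))
out[nzchar(x=out)] # drops the empty strings ("")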
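If the grouping by row matters, the same nzchar() filter can be applied inside lapply instead of after unlist(). The following is a minimal sketch on a two-row subset of the question's data, rebuilt with data.frame for self-containment.

```r
# Two rows of the question's data, rebuilt by hand
dc1 <- data.frame(V1 = c("20140211-0100", "20140211-1782"),
                  V2 = c("|Box", "|Office|Ball"),
                  stringsAsFactors = FALSE)

# Flattened: a single character vector across all rows
flat <- unlist(strsplit(dc1$V2, "\\|"))
flat <- flat[nzchar(flat)]
print(flat)

# Grouped: one character vector per row, empty strings dropped
grouped <- lapply(strsplit(dc1$V2, "\\|"), function(p) p[nzchar(p)])
print(grouped)
```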
Solution 6:[6]
library("stringr")
lapply(str_split(dc1$V2, "\\|"), function(x) x[-1])
[[1]]
[1] "Box"
[[2]]
[1] "Office" "Ball"
[[3]]
[1] "Office"
[[4]]
[1] "Office"
[[5]]
[1] "Box"
[[6]]
[1] "Box"
Solution 7:[7]
This post is cold but if this helps someone:
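library(magrittr) # provides %>%
# Note: "^\\|" only splits off the leading |, so row 2 stays "Office|Ball"
strsplit(as.character(dc1[,2]),"^\\|") %>%
lapply(function(x){paste0(x, collapse="")})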
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | jdharrison |
| Solution 2 | Math |
| Solution 3 | akrun |
| Solution 4 | |
| Solution 5 | lawyeR |
| Solution 6 | dondapati |
| Solution 7 | Vasco Pereira |
