'Concatenate 2+ columns into a new column, skip blanks in R
My current dataset looks something like this:
Col1 Col2 Col3
1 b c
2 c
3 B
I want to add a new column that returns everything in the row separated by commas, skipping blank cells. Example:
Col1 Col2 Col3 newCol
1 b c 1, b, c
2 c 2, c
3 B 3, B
I've tried df$newCol <- apply(df[,1:3], 1, paste, collapse=", "), but it returns the new column with extra commas:
newCol
1, b, c
2, , c
3, B,
Solution 1:[1]
Here we use unite function after replacing blank cells with NA.
At the end we replace back to blank cells:
- Main feature is to use
na.rmargument ofunite
library(dplyr)
library(tidyr)
df %>%
mutate(across(everything(), na_if,"")) %>%
unite(newCol, Col1:Col3, na.rm = TRUE, sep = ", ", remove = FALSE) %>%
mutate(across(everything(), ~replace_na(., ""))) %>%
relocate(newCol, .after = Col3)
Col1 Col2 Col3 newCol
1 1 b c 1, b, c
2 2 c 2, c
3 3 B 3, B
Solution 2:[2]
Using do.call with paste
df$newCol <- gsub("\\s+,\\s+", " ", trimws(do.call(paste,
c(df, sep=", ")), whitespace =",\\s*"))
Solution 3:[3]
A possible solution, when your empty cells are "":
df <- data.frame(
stringsAsFactors = FALSE,
Col1 = c(1L, 2L, 3L),
Col2 = c("b", "", "B"),
Col3 = c("c", "c", "")
)
df$newCol <- apply(df, 1, function(x) paste(x[x != ""], collapse=", "))
df
#> Col1 Col2 Col3 newCol
#> 1 1 b c 1, b, c
#> 2 2 c 2, c
#> 3 3 B 3, B
A possible solution, which uses na.omit when your empty cells are NA:
df <- data.frame(
stringsAsFactors = FALSE,
Col1 = c(1L, 2L, 3L),
Col2 = c("b", NA, "B"),
Col3 = c("c", "c", NA)
)
df$Col4 <- apply(df, 1, function(x) paste(na.omit(x), collapse = ", "))
df
#> Col1 Col2 Col3 Col4
#> 1 1 b c 1, b, c
#> 2 2 <NA> c 2, c
#> 3 3 B <NA> 3, B
Solution 4:[4]
Here is a data.table, do.call, and paste option:
library(data.table)
setDT(df)[, newCol := gsub("\\s+,\\s+", " ",
gsub(", $", "", do.call(paste, c(.SD, sep = ", "))))]
Output
Col1 Col2 Col3 newCol
1: 1 b c 1, b, c
2: 2 c 2, c
3: 3 B 3, B
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | TarJae |
| Solution 2 | akrun |
| Solution 3 | |
| Solution 4 | AndrewGB |
