Extracting letters and putting them in separate columns in R
I have a data set like this:
df<-data.frame(ID=(1:5), column1=c("AA","GG","AG","AA","AT"), column2=c("AA","GG","AG","AA","AT"), stringsAsFactors=FALSE)
df
ID column1 column2
1 AA AA
2 GG GG
3 AG AG
4 AA AA
5 AT AT
I want to split each column into its two letters, so that the output looks something like this:
ID column1.A column1.B column2.A column2.B
1 A A A A
2 G G G G
3 A G A G
4 A A A A
5 A T A T
Can you help me please?
Solution 1:[1]
Using strsplit.
cbind(df[1], do.call(cbind.data.frame, lapply(df[-1], function(x)
do.call(rbind, strsplit(x, '')))))
# ID column1.1 column1.2 column2.1 column2.2
# 1 1 A A A A
# 2 2 G G G G
# 3 3 A G A G
# 4 4 A A A A
# 5 5 A T A T
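The result above carries numeric suffixes (column1.1, column1.2), whereas the question asks for column1.A and column1.B. A minimal base R sketch of one way to get those names follows; split_pairs() is a hypothetical helper name introduced here for illustration, and it assumes every value is exactly two characters long.

```r
# Sample data copied from the question
df <- data.frame(ID = 1:5,
                 column1 = c("AA", "GG", "AG", "AA", "AT"),
                 column2 = c("AA", "GG", "AG", "AA", "AT"),
                 stringsAsFactors = FALSE)

# split_pairs() is a hypothetical helper, not part of base R.
# Assumption: every value in `cols` has exactly two characters.
split_pairs <- function(data, cols, suffixes = c("A", "B")) {
  pieces <- lapply(cols, function(col) {
    # take the first and second character of each value
    out <- data.frame(substr(data[[col]], 1, 1),
                      substr(data[[col]], 2, 2),
                      stringsAsFactors = FALSE)
    # name the new columns <col>.A and <col>.B
    names(out) <- paste(col, suffixes, sep = ".")
    out
  })
  # keep the untouched columns (here ID) in front of the split ones
  cbind(data[setdiff(names(data), cols)], do.call(cbind, pieces))
}

result <- split_pairs(df, c("column1", "column2"))
result
```

Using substr() rather than strsplit() fixes the number of output columns at two, so a value of a different length would be truncated or padded with an empty string rather than breaking the rbind step.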
Solution 2:[2]
Yet another solution, tidyverse-based:
library(tidyverse)
df <- data.frame(ID=(1:5), column1=c("AA","GG","AG","AA","AT"), column2=c("AA","GG","AG","AA","AT"), stringsAsFactors=FALSE)
df %>%
  mutate(
    # split each "columnN" value at the boundary between two capital letters
    across(
      starts_with("column"),
      ~ str_split(.x, "(?<=[A-Z])(?=[A-Z])", simplify = TRUE),
      .names = "{.col}_sep"),
    # drop the original two-letter columns
    column1 = NULL, column2 = NULL)
#> ID column1_sep.1 column1_sep.2 column2_sep.1 column2_sep.2
#> 1 1 A A A A
#> 2 2 G G G G
#> 3 3 A G A G
#> 4 4 A A A A
#> 5 5 A T A T
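One caveat about the printed result: with simplify set, str_split() returns a character matrix, so each column1_sep / column2_sep is most likely stored as a single matrix column, and the .1 / .2 suffixes only appear when printing. The base R sketch below reproduces that situation without the tidyverse and shows one way to flatten it; the names nested and flat are placeholders chosen for this example.

```r
# Sample column copied from the question
column1 <- c("AA", "GG", "AG", "AA", "AT")

# Build a data frame holding one matrix column, which is the shape
# that across() + str_split(simplify = TRUE) is assumed to produce
nested <- data.frame(ID = 1:5)
nested$column1_sep <- do.call(rbind, strsplit(column1, ""))

ncol(nested)   # counts the matrix as ONE column
names(nested)  # no "column1_sep.1" here, despite how it prints

# Flatten: data.frame() expands each matrix argument into ordinary columns
flat <- do.call(data.frame,
                c(as.list(nested), list(stringsAsFactors = FALSE)))
names(flat)
```

After flattening, the pieces can be selected or renamed like any other column, for example with names(flat)[-1] <- c("column1.A", "column1.B").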
Another possibility, based on a pivot_longer followed by a pivot_wider:
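library(tidyverse)
df <- data.frame(ID=(1:5), column1=c("AA","GG","AG","AA","AT"), column2=c("AA","GG","AG","AA","AT"), stringsAsFactors=FALSE)
df %>%
  # one row per ID/column pair
  pivot_longer(-ID) %>%
  # split each two-letter value into columns A and B
  separate(value, into = LETTERS[1:2], sep = "(?<=[A-Z])(?=[A-Z])") %>%
  # back to wide format; id_cols is named because recent tidyr versions
  # deprecate passing it by position
  pivot_wider(id_cols = ID, names_from = "name", values_from = c(A, B),
              names_glue = "{name}.{.value}") %>%
  relocate(column1.B, .before = column2.A)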
#> # A tibble: 5 × 5
#> ID column1.A column1.B column2.A column2.B
#> <int> <chr> <chr> <chr> <chr>
#> 1 1 A A A A
#> 2 2 G G G G
#> 3 3 A G A G
#> 4 4 A A A A
#> 5 5 A T A T
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | marc_s |
| Solution 2 | |
