'Addition corresponding to all possible combination of two columns based on a criteria

I have the following input

| ID | Column1 | Column2 | Value |
|:-- |:-------:|:-------:| -----:|
| A  |    x    |    a    |   4   |
| A  |    x    |    b    |   1   |
| A  |    y    |    a    |   5   |
| A  |    y    |    b    |   6   |
| B  |    u    |    a    |   9   |
| B  |    u    |    b    |   2   |
| B  |    v    |    a    |   7   |
| B  |    v    |    b    |   4   |
| B  |    w    |    a    |   1   |
| B  |    w    |    b    |   3   |

I am looking to get the following output where values in output would be addition of possible combinations of column 1 and column 2 in input for every ID, similar to below

| ID |     Column1    |    Value    |
|:-- |:--------------:| -----------:|
| A  |    xa+ya       |    4+5      |
| A  |    xa+yb       |    4+6      |
| A  |    xb+ya       |    1+5      |
| A  |    xb+yb       |    1+6      |
| B  |    ua+va+wa    |    9+7+1    |
| B  |    ua+va+wb    |    9+7+3    |
| B  |    ua+vb+wa    |    9+4+1    |
| B  |    ua+vb+wb    |    9+4+3    |
| B  |    ub+va+wa    |    2+7+1    |
| B  |    ub+va+wb    |    2+7+3    |
| B  |    ub+vb+wa    |    2+4+1    |
| B  |    ub+vb+wb    |    2+4+3    |


Solution 1:[1]

The logic for combinations is not very clear. Maybe this helps - paste (str_c from stringr), the 'Column1', 'Column2' to a single column, then grouped by 'ID', summarise the 'ColumnA' with combnations to paste the elements, while getting the sum from Value column, then filter out the combinations that are not needed based on the unique values in 'Column1'

library(dplyr)
library(stringr)
df1 %>% 
  mutate(ColumnA = str_c(Column1, Column2)) %>%
  group_by(ID) %>% 
  summarise(ColumnA = combn(ColumnA, n_distinct(Column1), 
     FUN = str_c, collapse = "+"), 
  Value = combn(Value, n_distinct(Column1), FUN = sum), .groups = 'drop')  %>% 
  mutate(new1 = str_remove_all(ColumnA, ".(?=(\\+|$))|\\+"), 
  new2 = str_replace_all(new1, "(.)\\1+", "\\1")) %>%
  filter(nchar(new2) == nchar(new1)) %>%
  select(ID, Column1 = ColumnA, Value)

-output

# A tibble: 12 × 3
   ID    Column1  Value
   <chr> <chr>    <int>
 1 A     xa+ya        9
 2 A     xa+yb       10
 3 A     xb+ya        6
 4 A     xb+yb        7
 5 B     ua+va+wa    17
 6 B     ua+va+wb    19
 7 B     ua+vb+wa    14
 8 B     ua+vb+wb    16
 9 B     ub+va+wa    10
10 B     ub+va+wb    12
11 B     ub+vb+wa     7
12 B     ub+vb+wb     9

Or another option is

library(tidyr)
df1 %>% 
  mutate(ColumnA = str_c(Column1, Column2)) %>% 
  group_by(ID) %>% 
  group_modify(~ bind_cols(
     split(.x$ColumnA, .x$Column1) %>%
     expand.grid %>% 
     unite(ColumnA, everything(), sep = "+"), 
    Value = split(.x$Value, .x$Column1) %>%
           expand.grid %>% 
           rowSums(.)) ) %>% 
    ungroup

-output

# A tibble: 12 × 3
   ID    ColumnA  Value
   <chr> <chr>    <dbl>
 1 A     xa+ya        9
 2 A     xb+ya        6
 3 A     xa+yb       10
 4 A     xb+yb        7
 5 B     ua+va+wa    17
 6 B     ub+va+wa    10
 7 B     ua+vb+wa    14
 8 B     ub+vb+wa     7
 9 B     ua+va+wb    19
10 B     ub+va+wb    12
11 B     ua+vb+wb    16
12 B     ub+vb+wb     9

data

df1 <- structure(list(ID = c("A", "A", "A", "A", "B", "B", "B", "B", 
"B", "B"), Column1 = c("x", "x", "y", "y", "u", "u", "v", "v", 
"w", "w"), Column2 = c("a", "b", "a", "b", "a", "b", "a", "b", 
"a", "b"), Value = c(4L, 1L, 5L, 6L, 9L, 2L, 7L, 4L, 1L, 3L)),
 row.names = c(NA, 
-10L), class = "data.frame")

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1