Using and pasting current column name into mutate case_when
My data looks like this:
dt <- structure(list(var1_dummy = c(0, 1, 0, 0, 0, 1, 0, 0, 1, 1),
var2_dummy = c(1, 0, 0, 0, 0, 1, 0, 0, 1, 1), var1_scale = c(NA,
3, NA, NA, NA, 3, NA, NA, 4, 4), var2_scale = c(3, NA, NA,
NA, NA, 2, NA, NA, 3, 5)), class = "data.frame", row.names = c(NA,
-10L))
var1_dummy var2_dummy var1_scale var2_scale
1 0 1 NA 3
2 1 0 3 NA
3 0 0 NA NA
4 0 0 NA NA
5 0 0 NA NA
6 1 1 3 2
7 0 0 NA NA
8 0 0 NA NA
9 1 1 4 3
10 1 1 4 5
I now want to mutate the variables with the suffix "scale" in a case_when that evaluates the corresponding variable with the suffix "dummy". (So the manipulation of var1_scale should depend on var1_dummy, etc.) The new version of var1_scale should be 0 if var1_dummy is 0, and should increase by 1 if var1_dummy is 1.
Note that I have many of such columns, so mutating every column individually should be avoided.
The variables to be mutated are in the following vector:
vars <- c("var1_scale", "var2_scale")
Now I manage to do what I want with the good ol' loop:
library(dplyr)  # for case_when()
for (var in vars) {
  # Name of the matching dummy column, e.g. "var1_scale" -> "var1_dummy"
  dummy <- gsub("scale", "dummy", var)
  dt[, var] <- case_when(
    dt[[dummy]] == 0 ~ 0,
    dt[[dummy]] == 1 ~ dt[[var]] + 1)
}
However, I'd prefer a vectorised solution. Here's what I tried:
dt %>%
  mutate(across(all_of(vars),
                ~ case_when(
                  !!as.symbol(gsub("scale", "dummy", as.name(cur_column()))) == 0 ~ 0,
                  !!as.symbol(gsub("scale", "dummy", as.name(cur_column()))) == 1 ~ . + 1)))
... the idea being that I take the name of the current column, change it with gsub() and then evaluate the result as a column again. But cur_column() does not seem to work inside case_when().
Solution 1:[1]
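With data.table (plus purrr for map2()), you can do something like:
library(data.table)
library(purrr)
# Columns are paired by position, so dummy and scale columns must appear in the same order
dummy_vars <- names(dt)[grep("dummy", names(dt))]
scale_vars <- names(dt)[grep("scale", names(dt))]
setDT(dt)[, (scale_vars) := map2(mget(dummy_vars), mget(scale_vars), ~ifelse(.x == 0, .x, .y + 1))]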
Or, more simply, using the fact that in this data the scale column is NA whenever the dummy column is 0:
scale_vars <- names(dt)[grep("scale", names(dt))]
setDT(dt)[, (scale_vars) := map(.SD, ~ifelse(is.na(.x), 0, .x + 1)), .SDcols = scale_vars]
Solution 2:[2]
a bit late, but this might help:
library(tidyverse)
dt %>%
mutate(across(ends_with("_scale"),
~case_when(
cur_data() %>% select(all_of(gsub("scale","dummy",cur_column()))) == 0 ~ 0,
cur_data() %>% select(all_of(gsub("scale","dummy",cur_column()))) == 1 ~ . + 1)
))
Rather than using as.symbol(), I'd simply use cur_data() and then select the dummy column from it, building its name with gsub() inside select(). (Note that cur_data() was deprecated in dplyr 1.1.0 in favour of pick().)
Also remember that cur_column() already returns the name of the column as a character string.
You could consider adding a pull() at the end to make sure you have a vector rather than a one-column data frame, but I think it works without it as well.
Hope this helps!
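As a variation on the same idea, the dummy column can also be looked up by name with base R's get(), which avoids cur_data() altogether. This is only a sketch, assuming the "_scale"/"_dummy" naming from the question and a dplyr version that provides across() and cur_column(); the name res is just for illustration.

```r
library(dplyr)

# Sample data from the question
dt <- data.frame(
  var1_dummy = c(0, 1, 0, 0, 0, 1, 0, 0, 1, 1),
  var2_dummy = c(1, 0, 0, 0, 0, 1, 0, 0, 1, 1),
  var1_scale = c(NA, 3, NA, NA, NA, 3, NA, NA, 4, 4),
  var2_scale = c(3, NA, NA, NA, NA, 2, NA, NA, 3, 5)
)

res <- dt %>%
  mutate(across(
    ends_with("_scale"),
    # cur_column() is the name of the scale column being processed;
    # get() fetches the matching dummy column by its derived name
    ~ case_when(
      get(sub("scale", "dummy", cur_column())) == 0 ~ 0,
      get(sub("scale", "dummy", cur_column())) == 1 ~ .x + 1
    )
  ))
```

Unlike the `!!` attempt in the question, nothing here is evaluated before across() has set up the current column, so cur_column() is only called while a scale column is actually being processed.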
Solution 3:[3]
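You can add two data frames of the same size directly; just change the NA values in the scale columns to 0 first. This relies on the scale column being NA exactly when the dummy column is 0, as in the sample data.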
dummy_cols <- grep('dummy', names(dt))
scale_cols <- grep('scale', names(dt))
dt[scale_cols] <- dt[dummy_cols] + replace(dt[scale_cols], is.na(dt[scale_cols]), 0)
dt
# var1_dummy var2_dummy var1_scale var2_scale
#1 0 1 0 4
#2 1 0 4 0
#3 0 0 0 0
#4 0 0 0 0
#5 0 0 0 0
#6 1 1 4 3
#7 0 0 0 0
#8 0 0 0 0
#9 1 1 5 4
#10 1 1 5 6
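If the scale columns cannot be assumed to be NA exactly where the dummy is 0, the rule from the question can be applied pair by pair with base R's Map(). A minimal sketch, assuming the "_scale"/"_dummy" naming convention; the columns are matched by name rather than by position:

```r
# Sample data from the question
dt <- data.frame(
  var1_dummy = c(0, 1, 0, 0, 0, 1, 0, 0, 1, 1),
  var2_dummy = c(1, 0, 0, 0, 0, 1, 0, 0, 1, 1),
  var1_scale = c(NA, 3, NA, NA, NA, 3, NA, NA, 4, 4),
  var2_scale = c(3, NA, NA, NA, NA, 2, NA, NA, 3, 5)
)

# Derive each dummy column name from its scale column name
scale_cols <- grep("_scale$", names(dt), value = TRUE)
dummy_cols <- sub("_scale$", "_dummy", scale_cols)

# Apply the rule to each (scale, dummy) pair:
# 0 where the dummy is 0, otherwise scale + 1
dt[scale_cols] <- Map(
  function(scale, dummy) ifelse(dummy == 0, 0, scale + 1),
  dt[scale_cols], dt[dummy_cols]
)
```

On the sample data this gives the same values as the output above, but a non-NA scale value next to a dummy of 0 would still be set to 0.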
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | det |
| Solution 2 | Moritz Schwarz |
| Solution 3 | Ronak Shah |
