Using functions of multiple columns in a dplyr mutate_at call
I'd like to use dplyr's mutate_at function to apply a function to several columns in a dataframe, where the function inputs the column to which it is directly applied as well as another column in the dataframe.
As a concrete example, I'd like to mutate the following dataframe
# Example input dataframe
df <- data.frame(
  x = c(TRUE, TRUE, FALSE),
  y = c("Hello", "Hola", "Ciao"),
  z = c("World", "ao", "HaOlam")
)
with a mutate_at call that looks similar to this
df %>%
  mutate_at(.vars = vars(y, z),
            .funs = ifelse(x, ., NA))
to return a dataframe that looks something like this
# Desired output dataframe
df2 <- data.frame(x = c(TRUE, TRUE, FALSE),
                  y_1 = c("Hello", "Hola", NA),
                  z_1 = c("World", "ao", NA))
The desired mutate_at call would be similar to the following call to mutate:
df %>%
  mutate(y_1 = ifelse(x, y, NA),
         z_1 = ifelse(x, z, NA))
I know that this can be done in base R in several ways, but I would specifically like to accomplish this goal using dplyr's mutate_at function for the sake of readability, interfacing with databases, etc.
Below are some similar questions asked on Stack Overflow that do not address the question I posed here:
adding multiple columns in a dplyr mutate call
dplyr::mutate to add multiple values
Use of column inside sum() function using dplyr's mutate() function
Solution 1:[1]
This was answered by @eipi10 in a comment on the question, but I'm writing it up here for posterity.
The solution here is to use:
df %>%
  mutate_at(.vars = vars(y, z),
            .funs = list(~ ifelse(x, ., NA)))
In dplyr 1.0.0 and later, you can also use the across() function with mutate(), like so:
df %>%
  mutate(across(c(y, z), ~ ifelse(x, ., NA)))
The use of the formula operator (as in ~ ifelse(...)) here indicates that ifelse(x, ., NA) is an anonymous function that is being defined within the call to mutate_at().
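To make the "formula as anonymous function" idea concrete, here is a minimal sketch outside of dplyr. It uses rlang::as_function(), which converts a one-sided formula into an ordinary function. The standalone vector x and the name keep_if_x are assumptions made for illustration; inside mutate_at(), x would instead be looked up as a column of the data frame.

```r
library(rlang)

# Stand-in for column `x` of the example data frame (illustration only).
x <- c(TRUE, TRUE, FALSE)

# as_function() turns the one-sided formula into a regular function;
# `.` stands for the function's first argument.
keep_if_x <- as_function(~ ifelse(x, ., NA))

keep_if_x(c("Hello", "Hola", "Ciao"))
#> [1] "Hello" "Hola"  NA
```

The resulting function keeps each element where x is TRUE and returns NA elsewhere, which is what the lambda does for each selected column.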
This works similarly to defining the function outside of the call to mutate_at(), like so:
temp_fn <- function(input) ifelse(test = df[["x"]],
                                  yes = input,
                                  no = NA)
df %>%
  mutate_at(.vars = vars(y, z),
            .funs = temp_fn)
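As a rough check, the two styles can be run side by side on the example data. This is only a sketch: the names via_lambda and via_named are hypothetical, and stringsAsFactors = FALSE is added on the assumption that the code might run on R older than 4.0, where strings would otherwise become factors and ifelse() would return integer codes.

```r
library(dplyr)

df <- data.frame(
  x = c(TRUE, TRUE, FALSE),
  y = c("Hello", "Hola", "Ciao"),
  z = c("World", "ao", "HaOlam"),
  stringsAsFactors = FALSE  # assumption: guards against factor columns on R < 4.0
)

# Lambda version: `x` is resolved as a column of the data being mutated.
via_lambda <- df %>%
  mutate_at(.vars = vars(y, z),
            .funs = list(~ ifelse(x, ., NA)))

# Named-function version: `x` is read from the global object `df`.
temp_fn <- function(input) ifelse(test = df[["x"]],
                                  yes = input,
                                  no = NA)
via_named <- df %>%
  mutate_at(.vars = vars(y, z),
            .funs = temp_fn)

# Compare the two results.
all.equal(via_lambda, via_named)
```

One difference worth keeping in mind: temp_fn always reads x from the global df, so it only lines up with the piped data as long as that data has the same rows in the same order. The lambda version does not have this dependency.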
Note on syntax changes in dplyr: Prior to dplyr version 0.8.0, you would simply write .funs = funs(ifelse(x, . , NA)), but funs() was deprecated in 0.8.0 in favor of a list of functions or formulas.
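If the same script has to run against different dplyr versions, one option is to branch on the installed version. The following is a sketch rather than a recommendation: it assumes dplyr 0.8.0 or later is installed (so the list-of-formulas syntax is available), and the variable name out is hypothetical.

```r
library(dplyr)

df <- data.frame(
  x = c(TRUE, TRUE, FALSE),
  y = c("Hello", "Hola", "Ciao"),
  z = c("World", "ao", "HaOlam"),
  stringsAsFactors = FALSE  # assumption: guards against factor columns on R < 4.0
)

if (packageVersion("dplyr") >= "1.0.0") {
  # across() is available from dplyr 1.0.0 onwards.
  out <- df %>%
    mutate(across(c(y, z), ~ ifelse(x, ., NA)))
} else {
  # Fallback for dplyr 0.8.x (assumed minimum version for this sketch).
  out <- df %>%
    mutate_at(.vars = vars(y, z),
              .funs = list(~ ifelse(x, ., NA)))
}

out
#>       x     y     z
#> 1  TRUE Hello World
#> 2  TRUE  Hola    ao
#> 3 FALSE  <NA>  <NA>
```

Both branches replace y and z in place, matching the behaviour of the solution above.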
Solution 2:[2]
To supplement the previous answer: if you want mutate_at() to add new variables instead of replacing the existing ones, with names such as y_1 and z_1 as in the original question, the approach depends on your dplyr version:
- dplyr >= 1 with across(): add .names = "{.col}_1", or alternatively use list(`1` = ~ifelse(x, ., NA)) (note the backticks!)
- dplyr [0.8, 1[: use list(`1` = ~ifelse(x, ., NA))
- dplyr < 0.8: use funs(`1` = ifelse(x, ., NA))
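The named-list alternative for across() mentioned in the first bullet can be sketched as follows. It assumes dplyr 1.0.0 or later; with a named list of functions and no .names argument, across() names each new column from the column name and the list name joined by an underscore. The variable name out is hypothetical, and stringsAsFactors = FALSE is an added safeguard for R older than 4.0.

```r
library(dplyr)

df <- data.frame(
  x = c(TRUE, TRUE, FALSE),
  y = c("Hello", "Hola", "Ciao"),
  z = c("World", "ao", "HaOlam"),
  stringsAsFactors = FALSE  # assumption: guards against factor columns on R < 4.0
)

# The list name `1` must be wrapped in backticks because it is not a
# syntactic R name; it becomes the suffix of the new columns (y_1, z_1).
out <- df %>%
  mutate(across(c(y, z), list(`1` = ~ ifelse(x, ., NA))))

names(out)
```

The original y and z columns are kept, and the masked copies are appended as y_1 and z_1.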
library(tidyverse)
df <- data.frame(
  x = c(TRUE, TRUE, FALSE),
  y = c("Hello", "Hola", "Ciao"),
  z = c("World", "ao", "HaOlam")
)
## Version >=1
df %>%
  mutate(across(c(y, z),
                list(~ifelse(x, ., NA)),
                .names="{.col}_1"))
#> x y z y_1 z_1
#> 1 TRUE Hello World Hello World
#> 2 TRUE Hola ao Hola ao
#> 3 FALSE Ciao HaOlam <NA> <NA>
## 0.8 - <1
df %>%
  mutate_at(.vars = vars(y, z),
            .funs = list(`1`=~ifelse(x, ., NA)))
#> x y z y_1 z_1
#> 1 TRUE Hello World Hello World
#> 2 TRUE Hola ao Hola ao
#> 3 FALSE Ciao HaOlam <NA> <NA>
## Before 0.8
df %>%
  mutate_at(.vars = vars(y, z),
            .funs = funs(`1`=ifelse(x, ., NA)))
#> Warning: `funs()` is deprecated as of dplyr 0.8.0.
#> Please use a list of either functions or lambdas:
#>
#> # Simple named list:
#> list(mean = mean, median = median)
#>
#> # Auto named with `tibble::lst()`:
#> tibble::lst(mean, median)
#>
#> # Using lambdas
#> list(~ mean(., trim = .2), ~ median(., na.rm = TRUE))
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_warnings()` to see where this warning was generated.
#> x y z y_1 z_1
#> 1 TRUE Hello World Hello World
#> 2 TRUE Hola ao Hola ao
#> 3 FALSE Ciao HaOlam <NA> <NA>
Created on 2020-10-03 by the reprex package (v0.3.0)
For more details and tricks, see: Create new variables with mutate_at while keeping the original ones
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | |
