How to pass column names to a function to carry out filter, select and sort actions in R
I am trying to wrap the filter, select and arrange actions on a data frame in a function.
Below is the code I am trying to replicate with a function:
mtcars %>%
  filter(disp > 150) %>%
  select(disp, hp) %>%
  arrange(hp)
The function I have created is as below:
process_data <- function(df, col_1, col_2){
  df %>% filter(col_1 > 150) %>%
    select(col_1, col_2)
}
process_data(df = mpg, col_1 = "disp", col_2 = "hp")
However, when I try to execute it, I get the error below:
Error: Can't subset columns that don't exist.
x Column disp doesn't exist.
I have tried multiple ways of passing the column names, but none of them work.
Solution 1:[1]
If the column names are passed as strings, we need to convert them to symbols (rlang::ensym) and unquote them with !! so that dplyr evaluates them as columns:
library(dplyr)
process_data <- function(df, col_1, col_2){
  col_1 <- rlang::ensym(col_1)
  col_2 <- rlang::ensym(col_2)
  df %>% filter(!!col_1 > 150) %>%
    select(!!col_1, !!col_2)
}
Testing:
process_data(df = mtcars, col_1 = "disp", col_2 = "hp")
                     disp  hp
Mazda RX4           160.0 110
Mazda RX4 Wag       160.0 110
Hornet 4 Drive      258.0 110
Hornet Sportabout   360.0 175
Valiant             225.0 105
Duster 360          360.0 245
Merc 280            167.6 123
Merc 280C           167.6 123
Merc 450SE          275.8 180
Merc 450SL          275.8 180
Merc 450SLC         275.8 180
Cadillac Fleetwood  472.0 205
Lincoln Continental 460.0 215
Chrysler Imperial   440.0 230
Dodge Challenger    318.0 150
AMC Javelin         304.0 150
Camaro Z28          350.0 245
Pontiac Firebird    400.0 175
Ford Pantera L      351.0 264
Maserati Bora       301.0 335
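The pipeline in the question also ends with arrange(hp), which the function above leaves out. A minimal sketch of how that step could be added with the same ensym/!! approach follows. The function name process_data_sorted and the threshold argument are hypothetical, introduced only for illustration. Because ensym() accepts either a string or a bare name, both call styles should work:

```r
library(dplyr)

# Hypothetical helper: same approach as above, plus the arrange() step.
# `threshold` is an illustrative argument, not part of the original answer.
process_data_sorted <- function(df, col_1, col_2, threshold = 150) {
  # ensym() turns a string ("disp") or a bare name (disp) into a symbol
  col_1 <- rlang::ensym(col_1)
  col_2 <- rlang::ensym(col_2)
  df %>%
    filter(!!col_1 > threshold) %>%   # keep rows where col_1 exceeds threshold
    select(!!col_1, !!col_2) %>%      # keep only the two requested columns
    arrange(!!col_2)                  # sort ascending by col_2
}

res_string <- process_data_sorted(mtcars, "disp", "hp")  # string input
res_bare   <- process_data_sorted(mtcars, disp, hp)      # bare-name input
```

Both calls should return the same rows as the output above, ordered by hp.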
Solution 2:[2]
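Another solution for string inputs, using any_of in select. Note that filter needs the .data pronoun here: a bare col_1 > 150 would compare the string itself with 150 rather than the column it names.
process_data <- function(df, col_1, col_2){
  df %>%
    # .data[[col_1]] looks up the column whose name is stored in col_1
    filter(.data[[col_1]] > 150) %>%
    select(any_of(c(col_1, col_2)))
}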
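For string inputs, a comparison such as "disp" > 150 never looks at the column; the .data pronoun is what performs the lookup by name. The sketch below assumes both arguments are always strings. It uses all_of() rather than any_of(), so a misspelled column name raises an error instead of being silently dropped, and it adds the arrange() step from the original pipeline. The function name process_data_str is hypothetical:

```r
library(dplyr)

# Hypothetical variant for string inputs only.
process_data_str <- function(df, col_1, col_2) {
  df %>%
    filter(.data[[col_1]] > 150) %>%      # look up the column named by col_1
    select(all_of(c(col_1, col_2))) %>%   # all_of() errors if a column is missing
    arrange(.data[[col_2]])               # sort ascending by the column named by col_2
}

res <- process_data_str(mtcars, "disp", "hp")
```

Whether any_of() or all_of() is the better choice depends on whether a missing column should be tolerated or reported.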
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | akrun |
| Solution 2 | flopeko |
