Filter data frame by character column name (in dplyr)

I have a data frame and want to filter it in one of two ways, by either column "this" or column "that". I would like to be able to refer to the column name as a variable. How (in dplyr, if that makes a difference) do I refer to a column name by a variable?

library(dplyr)
df <- data.frame(this = c(1, 2, 2), that = c(1, 1, 2))
df
#   this that
# 1    1    1
# 2    2    1
# 3    2    2
df %>% filter(this == 1)
#   this that
# 1    1    1

But say I want to use the variable column to hold either "this" or "that", and filter on whatever the value of column is. Both as.symbol and get work in other contexts, but not this:

column <- "this"
df %>% filter(as.symbol(column) == 1)
# [1] this that
# <0 rows> (or 0-length row.names)
df %>% filter(get(column) == 1)
# Error in get("this") : object 'this' not found

How can I turn the value of column into a column name?



Solution 1:[1]

I would steer clear of using get() altogether. It seems like it would be quite dangerous in this situation, especially if you're programming. You could use either an unevaluated call or a pasted character string, but you'll need to use filter_() instead of filter().

df <- data.frame(this = c(1, 2, 2), that = c(1, 1, 2))
column <- "this"

Option 1 - using an unevaluated call:

You can hard-code y as 1, but here I show it as y to illustrate how you can change the expression values easily.

expr <- lazyeval::interp(quote(x == y), x = as.name(column), y = 1)
## or 
## expr <- substitute(x == y, list(x = as.name(column), y = 1))
df %>% filter_(expr)
#   this that
# 1    1    1

Option 2 - using paste() (and obviously easier):

df %>% filter_(paste(column, "==", 1))
#   this that
# 1    1    1

The main thing about these two options is that we need to use filter_() instead of filter(). In fact, from what I've read, if you're programming with dplyr you should always use the *_() functions.

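I used this post as a helpful reference: character string as function argument r. I'm using dplyr version 0.3.0.2; note that the underscore verbs such as filter_() were later deprecated (in dplyr 0.7.0) in favor of tidy evaluation.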

Solution 2:[2]

Here's another solution for the latest dplyr version:

df <- data.frame(this = c(1, 2, 2),
                 that = c(1, 1, 2))
column <- "this"

df %>% filter(.[[column]] == 1)

#  this that
#1    1    1

Solution 3:[3]

Regarding Richard's solution, I just want to add that if the column is character, you can use shQuote to filter by character values.

For example, you can use

df %>% filter_(paste(column, "==", shQuote("a")))

If you have multiple filters, you can specify collapse = "&" in paste.

df %>% filter_(paste(c("column1","column2"), "==", shQuote(c("a","b")), collapse = "&"))
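
To see what filter_() actually receives, it can help to print the string that paste() builds before passing it on. A small sketch in base R only; column1, column2 and the values "a" and "b" are the placeholder names from the line above, not columns of the question's df:

```r
# Placeholder column names and values (hypothetical, for illustration only).
cols <- c("column1", "column2")
vals <- c("a", "b")

# paste() is vectorised: it builds one comparison per column,
# then collapse = "&" joins them into a single expression string.
expr_string <- paste(cols, "==", shQuote(vals), collapse = "&")
print(expr_string)
# [1] "column1 == 'a'&column2 == 'b'"
```

Inspecting the string this way makes quoting mistakes easier to spot before the expression is evaluated.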

Solution 4:[4]

The latest way to do this is to use my.data.frame %>% filter(.data[[myName]] == 1), where myName is a variable in your environment that contains the column name as a character string.
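
Applied to the data frame from the question, this could look like the sketch below. It assumes a dplyr version that provides the .data pronoun; the df and the variable name myName are taken from the question and from the sentence above.

```r
library(dplyr)

df <- data.frame(this = c(1, 2, 2), that = c(1, 1, 2))
myName <- "this"  # column name held in an ordinary character variable

# .data[[myName]] looks the column up by name inside the data frame
result <- df %>% filter(.data[[myName]] == 1)
result
#   this that
# 1    1    1
```

Setting myName <- "that" instead filters on the other column without changing the filter() call.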

Solution 5:[5]

Or using filter_at

library(dplyr)
df %>% 
   filter_at(vars(column), any_vars(. == 1))

Solution 6:[6]

Like Salim B explained above but with a minor change:

df %>% filter(1 == !!as.name(column))

i.e. just reverse the condition, because ! binds less tightly than == in R, so !!as.name(column) == 1 would otherwise be parsed like

!!(as.name(column)==1)
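
An alternative to reversing the condition is to wrap the unquoted name in parentheses so the grouping is explicit. A minimal sketch, assuming the df and column from the question and a dplyr version that supports !!:

```r
library(dplyr)

df <- data.frame(this = c(1, 2, 2), that = c(1, 1, 2))
column <- "this"

# The parentheses force !! to apply to the symbol only,
# so the comparison with 1 happens after unquoting.
result <- df %>% filter((!!as.name(column)) == 1)
result
#   this that
# 1    1    1
```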

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Community
Solution 2 paul_dg
Solution 3 StatCC
Solution 4 Phoenix Mu
Solution 5 akrun
Solution 6 carand