'R - Data.table - Using variable column names in RHS operations

How do I use variable column names on the RHS of := operations? For example, given this data.table "dt", I'd like to create two new columns, "first_y" and "first_z" that contains the first observation of the given column for the values of "x".

dt <- data.table(x = c("one","one","two","two","three"), 
                 y = c("a", "b", "c", "d", "e"), 
                 z = c(1, 2, 3, 4, 5))

dt
       x y z
1:   one a 1
2:   one b 2
3:   two c 3
4:   two d 4
5: three e 5

Here's how you would do it without variable column names.

dt[, c("first_y", "first_z") := .(first(y), first(z)), by = x]

dt
       x y z first_y first_z
1:   one a 1       a       1
2:   one b 2       a       1
3:   two c 3       c       3
4:   two d 4       c       3
5: three e 5       e       5

But how would I do this if the "y" and "z" column names are dynamically stored in a variable?

cols <- c("y", "z")

# This doesn't work
dt[, (paste0("first_", cols)) := .(first(cols)), by = x]

# Nor does this
q <- quote(first(as.name(cols[1])))
p <- quote(first(as.name(cols[2])))
dt[, (paste0("first_", cols)) := .(eval(q), eval(p)), by = x]

I've tried numerous other combinations of quote() and eval() and as.name() without success. The LHS of the operation appears to be working as intended and is documented in many places, but I can't find anything about using a variable column name on the RHS. Thanks in advance.



Solution 1:[1]

I'm not familiar with the first function (although it looks like something Hadley would define).

dt[, paste0("first_", cols) := lapply(.SD, head, n = 1L), 
   by = x, .SDcols = cols]
#       x y z first_y first_z
#1:   one a 1       a       1
#2:   one b 2       a       1
#3:   two c 3       c       3
#4:   two d 4       c       3
#5: three e 5       e       5

Solution 2:[2]

The .SDcols answer is fine for this case, but you can also just use get:

dt[, paste0("first_", cols) := lapply(cols, function(x) get(x)[1]), by = x]
dt
#       x y z first_y first_z
#1:   one a 1       a       1
#2:   one b 2       a       1
#3:   two c 3       c       3
#4:   two d 4       c       3
#5: three e 5       e       5

Another alternative is the vectorized version - mget:

dt[, paste0("first_", cols) := setDT(mget(cols))[1], by = x]

Solution 3:[3]

I can never get "get" or "eval" to work on the RHS when trying to do mathematical operations. Try this if you need to.

Thing_dt[, c(new_col) := Thing_dt[[oldcol1]] * Thing_dt[[oldcol2]]]

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2
Solution 3 Martin Gal