Convert variables in data.table to formula

I have a sample data.table, data, as shown below:

   VarName Formulae
1:       A      1+1
2:       B      A+3
3:       C     B*10
4:       D      A+C
5:       E      D/2

I want to convert the Formulae column into formulas and evaluate them, so that the output looks like this:

   VarName Result
1:       A      2
2:       B      5
3:       C     50
4:       D     52
5:       E     26

Basically the VarName column is the variable name and the Formulae column is the corresponding formula.

A = 1+1
B = A+3
C = B*10
D = A+C
E = D/2

I tried using the eval and parse functions, for example data$VarName = eval(parse(text = "data$Formulae")), but I could not get the desired output.



Solution 1:[1]

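Loop through VarName, replace each name with its Formulae wrapped in brackets, then evaluate:

# x is the data.table from the question (called data there)
# named character vector: names are VarName, values are Formulae
res <- setNames(x$Formulae, x$VarName)

# repeat until no formula mentions any variable name
while(any(grepl(paste0(names(res), collapse = "|"), res))) {
  for(i in names(res)){
    # substitute the bracketed formula of i wherever i appears
    res <- gsub(i, paste0("(", res[ i ], ")"), res, fixed = TRUE)
  }
}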

#res, after replacements:
#                          A                          B 
#                      "1+1"                  "(1+1)+3" 
#                          C                          D 
#             "((1+1)+3)*10"     "(1+1)+(((1+1)+3)*10)" 
#                          E 
# "((1+1)+(((1+1)+3)*10))/2" 

# evaluate
sapply(res, function(i) eval(parse(text = i)))
#A  B  C  D  E 
#2  5 50 52 26 
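
When the rows are already listed in dependency order, as in the question, a different technique from the substitution above may be enough: evaluate each formula in turn inside a dedicated environment and store the value under its VarName. The following is only a sketch under that ordering assumption; it uses a plain data.frame named x to stay free of package dependencies, and the names env and value are illustrative.

```r
# Sample data from the question, rebuilt as a plain data.frame
# (assumption: rows are ordered so each formula only uses earlier rows)
x <- data.frame(VarName  = c("A", "B", "C", "D", "E"),
                Formulae = c("1+1", "A+3", "B*10", "A+C", "D/2"),
                stringsAsFactors = FALSE)

# Dedicated environment so the variables do not clutter .GlobalEnv
env <- new.env()

for (i in seq_len(nrow(x))) {
  # Evaluate the formula text; earlier variables are looked up in env
  value <- eval(parse(text = x$Formulae[i]), envir = env)
  # Store the result under the row's variable name
  assign(x$VarName[i], value, envir = env)
}

# Collect the values in the original row order
x$Result <- unname(unlist(mget(x$VarName, envir = env)))
x[, c("VarName", "Result")]
```

If a formula refers to a variable defined in a later row, this sketch stops with an "object not found" error, which is the case the substitution loop above handles.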

Solution 2:[2]

Another function that can be useful for this task, especially in more complex cases where the rows are not listed in evaluation order, is delayedAssign. It assigns an expression to a name and evaluates it only when the value is requested. This way, each object is evaluated on demand, pulling in the objects it depends on, until its value is reached. For example, consider the following "data.frame":

d = structure(list(v = c("a", "b", "A", "B", "C", "D", "E"), 
                   f = c("C+b", "A+B/D", "1+1", "A+3", "B*10", "A+C", "D/2")), 
              class = "data.frame", row.names = c(NA, -7L))

Then we set up a new environment (to avoid cluttering .GlobalEnv) and assign our variables:

e = new.env()
forms = parse(text = d$f)
for(i in 1:nrow(d)) do.call(delayedAssign, list(d$v[i], forms[[i]], e, e))

And evaluate:

unlist(mget(ls(e), e)) #or
unlist(eapply(e, eval))
#        A         B         C         D         a         E         b 
# 2.000000  5.000000 50.000000 52.000000 52.096154 26.000000  2.096154 
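
To illustrate the point about evaluation order, here is a small sketch that applies the same delayedAssign pattern to the question's formulas with the rows deliberately reversed. The reversed data frame d2 and the names e2, forms2 and res2 are made up for this illustration.

```r
# The question's formulas in reverse order: each row depends on a later one
d2 <- data.frame(v = c("E", "D", "C", "B", "A"),
                 f = c("D/2", "A+C", "B*10", "A+3", "1+1"),
                 stringsAsFactors = FALSE)

e2 <- new.env()
forms2 <- parse(text = d2$f)

# Register each formula as a promise in e2; nothing is evaluated yet
for (i in seq_len(nrow(d2))) {
  do.call(delayedAssign, list(d2$v[i], forms2[[i]], e2, e2))
}

# Requesting the values forces the promises in whatever order is needed
res2 <- unlist(mget(d2$v, envir = e2))
res2
```

Passing d2$v to mget, rather than ls(e2), returns the results in the original row order, which makes it straightforward to attach them back to the data as a Result column.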

Solution 3:[3]

Using apply:

df <- data.frame("VarName"=c("X","Y"),"Formulae"=c("1+1","X+1"))
df$formulas <- apply(df,1,function(x)eval(parse(text = paste0(x["VarName"]," ~ ",x["Formulae"]))))

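Your eval(parse(...)) structure was on the right track, and the version above should work. Note that it builds formula objects of the form VarName ~ Formulae rather than computing numeric results. Someone may still propose a cleaner approach.

Note that the "formulas" column cannot be an atomic vector, so it is a list column.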

str(df)
'data.frame':   2 obs. of  3 variables:
 $ VarName : chr  "X" "Y"
 $ Formulae: chr  "1+1" "X+1"
 $ formulas:List of 2
  ..$ :Class 'formula'  language X ~ 1 + 1
  .. .. ..- attr(*, ".Environment")=<environment: 0x000002933f8904a8> 
  ..$ :Class 'formula'  language Y ~ X + 1
  .. .. ..- attr(*, ".Environment")=<environment: 0x000002933fb6f3b8> 

List columns can cause headaches when working with data frames. In this case I suggest using mapping tools such as purrr instead of concatenating everything into a data frame.
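
As a rough sketch of that suggestion, the formulas can be kept in a named list next to the data frame instead of inside it. Base R's Map is used here in place of purrr to avoid a package dependency, which is a substitution on my part; the object names df2 and formulas are illustrative.

```r
df2 <- data.frame(VarName  = c("X", "Y"),
                  Formulae = c("1+1", "X+1"),
                  stringsAsFactors = FALSE)

# Build one formula object per row; Map names the result by VarName
# because its first vector argument is a character vector
formulas <- Map(function(lhs, rhs) as.formula(paste(lhs, "~", rhs)),
                df2$VarName, df2$Formulae)

# Look a formula up by variable name
formulas$Y
```

The data frame stays made of plain character columns, and each formula is reachable by name, for example formulas[["X"]].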

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 alexis_laz
Solution 3