Progress bar in data.table aggregate action
ddply has a .progress argument that shows a progress bar while it runs. Is there an equivalent for data.table in R?
Solution 1:[1]
Following up on @jangorecki's excellent answer, here's a way to use a text progress bar:
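library(data.table)
dt <- data.table(a = 1:4, b = c("a", "b"))
# Number of groups, used as the upper bound of the bar
grpn <- uniqueN(dt$b)
pb <- txtProgressBar(min = 0, max = grpn, style = 3)
# .GRP is the index of the current group; Sys.sleep() only slows the demo down
dt[, {setTxtProgressBar(pb, .GRP); Sys.sleep(0.5); sum(a)}, by = b]
close(pb)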
Solution 2:[2]
Following up again on @jangorecki's great answer.
If you don't want to spam your terminal too much, you can write an external function, equivalent to jangorecki's, that does a modulus check and only prints when .GRP is divisible by a chosen number, "mod". Note: when I tried it, using if directly inside the data.table curly brackets didn't work, which I assume is because if statements in R also use curly brackets.
progress = function(.GRP, grpn, mod) {
  if (!(.GRP %% mod)) {
    cat("progress", .GRP/grpn*100, "%\n")
  }
}
Then call it inside j. Here I use mod = 1000, so the percentage is printed only every 1000 groups.
dt[, {progress(.GRP, grpn, 1000); sum(a)}, b]
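To see the throttling on a small scale, here is a minimal, self-contained sketch. The toy data (10 groups), the choice mod = 5, and the argument name grp are assumptions made for illustration only; the helper is otherwise the same idea as above.

```r
library(data.table)

# Hypothetical toy data: 10 groups of 2 rows each
dt <- data.table(a = 1:20, b = rep(letters[1:10], each = 2))
grpn <- uniqueN(dt$b)

# Print progress only when the group counter is a multiple of `mod`
progress <- function(grp, grpn, mod) {
  if (grp %% mod == 0) {
    cat("progress", grp / grpn * 100, "%\n")
  }
  invisible(NULL)
}

# With mod = 5, a message is printed for groups 5 and 10 only
res <- dt[, {progress(.GRP, grpn, 5); sum(a)}, by = b]
```

The helper returns invisible(NULL) so that only the last expression in the curly brackets, sum(a), ends up in the result.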
Solution 3:[3]
Following up on @jangorecki and other great answers, you can use the data.table symbol .NGRP instead of calculating grpn as in the other answers:
dt[, {cat("progress",.GRP/.NGRP*100,"%\n"); sum(a)}, b]
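The same symbol can also drive the text progress bar from Solution 1. This is a rough sketch, assuming a data.table version that provides .NGRP; the bar is scaled from 0 to 1 here (my choice, not from the answers above) so that it can be fed the fraction of groups completed without computing grpn beforehand.

```r
library(data.table)

dt <- data.table(a = 1:4, b = c("a", "b"))

# Bar runs from 0 to 1, so it can be fed the fraction of groups completed
pb <- txtProgressBar(min = 0, max = 1, style = 3)
res <- dt[, {setTxtProgressBar(pb, .GRP / .NGRP); sum(a)}, by = b]
close(pb)
```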
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Zach |
| Solution 2 | Eliot Behr |
| Solution 3 | Eric Aya |
