'How to use dplyr or data.table to perform look-ahead calculations by groups of data subsets?
I'm interested in using dplyr (for ease) and/or data.table (for speed since actual data to be operated on has 3 million + rows) to perform "look ahead" calculations on a data set, by grouped subsets, and then flag those rows where those look ahead calculations return a value of 0. The image at the bottom better illustrates what I'm trying to do, where State_1 is a column added (by dplyr mutate(...) for example) to the original data dataframe. Any recommendations for how to do this?
Mastering look-ahead calculations will help tremendously, it's something I've always dealt with clumsily, in XLS.
Starting with the data dataframe, with generating code immediately beneath:
> data
ID Period Values_1 Values_2 State
1 1 1 5 5 X0
2 1 2 0 2 X1
3 1 3 0 0 X2
4 1 4 0 12 X1
5 2 1 1 2 X0
6 2 2 0 0 X2
7 2 3 0 0 X0
8 2 4 0 0 X0
9 3 1 0 0 X2
10 3 2 0 0 X1
11 3 3 0 0 X9
12 3 4 0 2 X3
13 4 1 1 4 X2
14 4 2 2 5 X1
15 4 3 3 6 X9
16 4 4 0 0 X3
data <-
data.frame(
ID = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4),
Period = c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4),
Values_1 = c(5, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 0),
Values_2 = c(5, 2, 0, 12, 2, 0, 0, 0, 0, 0, 0, 2, 4, 5, 6, 0),
State = c("X0","X1","X2","X1","X0","X2","X0","X0", "X2","X1","X9","X3", "X2","X1","X9","X3")
)
Here is an illustration of what I'm try to accomplish:
Solution 1:[1]
We may use a group by approach - grouped by 'ID', loop over the 'Values' column with if_all to return TRUE for rows having only 0 values, then create an index where the last position is not the last observation along with the created index and replace
library(dplyr)
data %>%
group_by(ID) %>%
mutate(ind1 = if_all(starts_with('Values'), ~ .x == 0),
ind2 = last(which(ind1))== n() & ind1,
State_1 = replace(State, ind2, 'XX'), ind1 = NULL, ind2 = NULL) %>%
ungroup
-output
# A tibble: 16 × 6
ID Period Values_1 Values_2 State State_1
<dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 1 1 5 5 X0 X0
2 1 2 0 2 X1 X1
3 1 3 0 0 X2 X2
4 1 4 0 12 X1 X1
5 2 1 1 2 X0 X0
6 2 2 0 0 X2 XX
7 2 3 0 0 X0 XX
8 2 4 0 0 X0 XX
9 3 1 0 0 X2 X2
10 3 2 0 0 X1 X1
11 3 3 0 0 X9 X9
12 3 4 0 2 X3 X3
13 4 1 1 4 X2 X2
14 4 2 2 5 X1 X1
15 4 3 3 6 X9 X9
16 4 4 0 0 X3 XX
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | akrun |

