'Count within a 30 days window

I want to keep an observation (grouped by ID) for every 30 days. I want to do this by creating a variable that tells me which observations are left inside (1) and which ones are outside (0) of the filter.

Example

id   date
1   3/1/2021
1   4/1/2021
1   5/1/2021
1   6/1/2021
1   2/2/2021
1   3/2/2021
1   5/2/2021
1   7/2/2021
1   9/2/2021
1    11/2/2021
1    13/2/2021
1    16/3/2021
2   5/1/2021
2   31/10/2021
2   9/1/2021
2   6/2/2021
2   1/6/2021
3   1/1/2021
3   1/6/2021
3   31/12/2021
4   5/5/2021

Expected result

id   date   count
1   3/1/2021    1
1   4/1/2021    0
1   5/1/2021    0
1   6/1/2021    0
1   2/2/2021    0
1   3/2/2021    1
1   5/2/2021    0
1   7/2/2021    0
1   9/2/2021    0
1    11/2/2021  0
1    13/2/2021  0
1    16/3/2021  1
2   5/1/2021    1
2   31/10/2021  1
2   9/1/2021    0
2   6/2/2021    1
2   1/6/2021    1
3   1/1/2021    1
3   1/6/2021    1
3   31/12/2021  1
4   5/5/2021    1




Solution 1:[1]

here is a data.table approach

library(data.table)
# sort by id by date
setkey(DT, id, date)
# create groups
DT[, group := rleid((as.numeric(date - date[1])) %/% 30), by = .(id)][]
#  create count column
DT[, count := ifelse(!group == shift(group, type = "lag", fill = 0), 1, 0), by = .(id)][]
#    id       date group count
# 1:  1 2021-01-03     1     1
# 2:  1 2021-01-04     1     0
# 3:  1 2021-01-05     1     0
# 4:  1 2021-01-06     1     0
# 5:  1 2021-02-02     2     1
# 6:  1 2021-02-03     2     0
# 7:  1 2021-02-05     2     0
# 8:  1 2021-02-07     2     0
# 9:  1 2021-02-09     2     0
#10:  1 2021-02-11     2     0
#11:  1 2021-02-13     2     0
#12:  1 2021-03-16     3     1
#13:  2 2021-01-05     1     1
#14:  2 2021-01-09     1     0
#15:  2 2021-02-06     2     1
#16:  2 2021-06-01     3     1
#17:  2 2021-10-31     4     1
#18:  3 2021-01-01     1     1
#19:  3 2021-06-01     2     1
#20:  3 2021-12-31     3     1
#21:  4 2021-05-05     1     1
#    id       date group count

sample data used

DT <- fread("id   date
1   3/1/2021
1   4/1/2021
1   5/1/2021
1   6/1/2021
1   2/2/2021
1   3/2/2021
1   5/2/2021
1   7/2/2021
1   9/2/2021
1    11/2/2021
1    13/2/2021
1    16/3/2021
2   5/1/2021
2   31/10/2021
2   9/1/2021
2   6/2/2021
2   1/6/2021
3   1/1/2021
3   1/6/2021
3   31/12/2021
4   5/5/2021")
# set date as actual date
DT[, date := as.Date(date, "%d/%m/%Y")]

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1