'Pandas 5 Day Rolling Count Unique Days
I have a table of sales in which I need to have a rolling count of days in which a buyer bought something for the past 5 days.
The sales data looks like this:
| DATE | BUYER | AMOUNT BOUGHT |
|---|---|---|
| 1/1/2022 | a | 2 |
| 1/3/2022 | a | 4 |
| 1/20/2022 | a | 3 |
| 1/4/2022 | b | 6 |
| 1/21/2022 | b | 10 |
The table should look back 5 days based on the date column and count all of the days in which the given buyer made a purchase. The amount of rows within the 5 day window is undefined, so I can't count the rows between.
The table should look like:
| DATE | BUYER | AMOUNT BOUGHT | 5-DAY ROLLING DAYS WITH PURCHASES |
|---|---|---|---|
| 1/1/2022 | a | 2 | 1 |
| 1/3/2022 | a | 4 | 2 |
| 1/20/2022 | a | 3 | 1 |
| 1/4/2022 | b | 6 | 1 |
| 1/21/2022 | b | 10 | 1 |
I've been trying to use nunique on the DATE column, but haven't had success:
What I have tried:
df_five_days['5-DAY ROLLING DAYS WITH PURCHASES'] = df.groupby(['BUYER']).rolling('5D', min_periods=1, on='DATE').DATE.nunique()
The goal is to capture buyers, on any given day, who have bought in the past 5 days, more than 3 days.
Solution 1:[1]
See if this will suit you. I did it like this:
df['DATE'] = pd.to_datetime(df['DATE'])
df['5-DAY ROLLING DAYS WITH PURCHASES'] = (df.groupby(['BUYER']).rolling('5D', min_periods=1, on='DATE').count()).reset_index().iloc[:, 3]
Output
DATE BUYER AMOUNT BOUGHT 5-DAY ROLLING DAYS WITH PURCHASES
0 2022-01-01 a 2 1.0
1 2022-01-03 a 4 2.0
2 2022-01-20 a 3 1.0
3 2022-01-04 b 6 1.0
4 2022-01-21 b 10 1.0
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | inquirer |
