Pandas resample daily to weekly data
I want to divide daily data into 5 groups. Each group starts on a different day and has a fixed frequency of 5 business days, so that all the Mondays end up together, all the Tuesdays together, and so on. I tried the resample function:
df1 = df.resample('5B').first()
df2 = df.resample('5B', offset=1).first()
df3 = df.resample('5B', offset=2).first()
I was expecting that df1 starts from, let's say, 2000-01-03, df2 starts from 2000-01-04 and df3 starts from 2000-01-05. But the result shows that both df2 and df3 start from 2000-01-03. Is my understanding of offset wrong?
Solution 1:[1]
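I'm assuming a DataFrame whose index holds the dates as a datetime type (a DatetimeIndex). For instance:
df = pd.DataFrame({'col': range(32)}, index=pd.date_range('2000-01-03', '2000-02-03'))
If you want to split your data by weekday, use the index's weekday attribute (df.index.weekday, 0 for Monday to 6 for Sunday; for a datetime column the equivalent is dt.weekday) and groupby in a dictionary comprehension (or a loop if you want to save each group to a file). Note that the filter i<6 keeps Monday through Saturday and drops only Sunday; use i<5 to keep just the five business days: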
dfs = {f'df{i+1}': d
       for i, d in df.groupby(df.index.weekday)
       if i < 6}
Example output:
{'df1': col
2000-01-03 0
2000-01-10 7
2000-01-17 14
2000-01-24 21
2000-01-31 28,
'df2': col
2000-01-04 1
2000-01-11 8
2000-01-18 15
2000-01-25 22
2000-02-01 29,
'df3': col
2000-01-05 2
2000-01-12 9
2000-01-19 16
2000-01-26 23
2000-02-02 30,
'df4': col
2000-01-06 3
2000-01-13 10
2000-01-20 17
2000-01-27 24
2000-02-03 31,
'df5': col
2000-01-07 4
2000-01-14 11
2000-01-21 18
2000-01-28 25,
'df6': col
2000-01-08 5
2000-01-15 12
2000-01-22 19
2000-01-29 26}
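For reference, here is a self-contained sketch of the same groupby approach, restricted to the five business days the question asks about. The dictionary name by_weekday and the short day labels are hypothetical choices made for illustration, not part of the original answer; the sample frame is the one assumed above.

```python
import pandas as pd

# Sample frame from the answer: 32 consecutive calendar days,
# starting on Monday 2000-01-03.
df = pd.DataFrame({'col': range(32)},
                  index=pd.date_range('2000-01-03', '2000-02-03'))

# Hypothetical labels for illustration; weekday 0 is Monday, 4 is Friday.
names = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri']

# Group rows by the weekday of the index and keep only Monday-Friday
# (i < 5 drops Saturday = 5 and Sunday = 6).
by_weekday = {names[i]: d
              for i, d in df.groupby(df.index.weekday)
              if i < 5}

# First date in the Tuesday group
print(by_weekday['Tue'].index[0])
```

Each value in by_weekday is an ordinary DataFrame, so by_weekday['Mon'] can stand in for the df1 the question was trying to build, by_weekday['Tue'] for df2, and so on.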
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | mozway |
