Pandas resample daily to weekly data

I want to divide the daily data into 5 groups. Each group starts on a different day and repeats at a fixed frequency of 5 business days, so that all the Mondays are put together, all the Tuesdays are put together, and so on. I used the resample function:

df1 = df.resample('5B').first()
df2 = df.resample('5B', offset=1).first()
df3 = df.resample('5B', offset=2).first()

I was expecting that df1 starts from, let's say, 2000-01-03, df2 starts from 2000-01-04 and df3 starts from 2000-01-05. But the result shows that both df2 and df3 start from 2000-01-03. Is my understanding of offset wrong?



Solution 1:[1]

I'm assuming a DataFrame whose index holds the dates as datetime type, for instance:

df = pd.DataFrame({'col': range(32)}, index=pd.date_range('2000-01-03', '2000-02-03'))

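If you want to split your data by weekday, use the weekday attribute of the index (0 -> Monday to 6 -> Sunday) and groupby in a dictionary comprehension (or a loop for saving to file). The i<6 filter keeps Monday through Saturday and drops only Sunday; use i<5 if you want business days only: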

dfs = {f'df{i+1}': d
       for i,d in df.groupby(df.index.weekday)
       if i<6}

Example output:

{'df1':             col
 2000-01-03    0
 2000-01-10    7
 2000-01-17   14
 2000-01-24   21
 2000-01-31   28,
 'df2':             col
 2000-01-04    1
 2000-01-11    8
 2000-01-18   15
 2000-01-25   22
 2000-02-01   29,
 'df3':             col
 2000-01-05    2
 2000-01-12    9
 2000-01-19   16
 2000-01-26   23
 2000-02-02   30,
 'df4':             col
 2000-01-06    3
 2000-01-13   10
 2000-01-20   17
 2000-01-27   24
 2000-02-03   31,
 'df5':             col
 2000-01-07    4
 2000-01-14   11
 2000-01-21   18
 2000-01-28   25,
 'df6':             col
 2000-01-08    5
 2000-01-15   12
 2000-01-22   19
 2000-01-29   26}
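
For the five business-day groups asked about in the question, the same idea can be written as an explicit loop. This is only a minimal sketch: it assumes the sample DataFrame from above, and both the `business` variable and the CSV file name in the comment are hypothetical names used for illustration.

```python
import pandas as pd

# Same sample data as above: 32 consecutive calendar days starting on a Monday.
df = pd.DataFrame({'col': range(32)},
                  index=pd.date_range('2000-01-03', '2000-02-03'))

# Keep only Monday (0) through Friday (4), then split by weekday.
business = df[df.index.weekday < 5]

dfs = {}
for i, d in business.groupby(business.index.weekday):
    dfs[f'df{i+1}'] = d
    # d.to_csv(f'df{i+1}.csv')  # hypothetical file name; uncomment to save each group

print(list(dfs))
print(dfs['df2'])
```

Here `dfs['df1']` holds the Mondays, `dfs['df2']` the Tuesdays, and so on up to `dfs['df5']` for the Fridays.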

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 mozway