Pandas rolling average of a column of dates

I'm trying to calculate the rolling average of a column of datetime objects. In my scenario, the input data are the last day below freezing each year for ~100 years.

Ignoring the years, I want to take a rolling average over the month-day rows. The output would be the average last day below freezing over some window of years.

My final plot will be years on the x-axis and the rolling average of the last day below freezing on the y-axis.

Here is a minimal setup of rows I want to average over.

import pandas as pd

df = pd.Series(['1900-05-04','1901-05-03', ..., '1910-04-30'])
df = pd.to_datetime(df)

>>> df

87   1900-05-04
88   1901-05-03
89   1902-05-06
90   1903-05-01
91   1904-05-03
92   1905-04-29
93   1906-05-03
94   1907-05-03
95   1908-05-04
96   1909-04-30
97   1910-04-30
dtype: datetime64[ns]
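For anyone who wants to reproduce the printed series, here is a minimal sketch that spells out the eleven dates in full. The dates and the 87..97 index labels are copied from the output above; nothing else about the original data set is assumed.

```python
import pandas as pd

# Dates copied from the printed output above, one per year 1900..1910.
raw = [
    '1900-05-04', '1901-05-03', '1902-05-06', '1903-05-01',
    '1904-05-03', '1905-04-29', '1906-05-03', '1907-05-03',
    '1908-05-04', '1909-04-30', '1910-04-30',
]

# Index labels 87..97 mirror the printed output.
df = pd.to_datetime(pd.Series(raw, index=range(87, 98)))
print(df)
```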

Desired output (with or without the years) with a window size of two would be:

1   1901-05-04
2   1902-05-05
3   1903-05-04
4   1904-05-02
5   1905-05-01
6   1906-05-03
7   1907-05-04
8   1908-05-02
9   1909-04-30
10  1910-04-30


Solution 1:[1]

Here is a hack! Take it as inspiration: it is not optimal, and a better solution most likely exists.

pd.to_datetime(pd.DataFrame((
    df.loc[1:].dt.year, df.loc[1:].dt.month,
    df.groupby(df.dt.month).transform(lambda d: d.dt.day.rolling(2).mean().round(0)).loc[1:].astype(int),
)).T.rename(columns={0: 'year', 1: 'month', 2: 'day'}))

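I grouped by month and took a rolling mean of the day, then rebuilt each date by assembling the year, month, and rolled day. Because the rolling window is applied within each month group, a window never mixes April and May dates.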
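A different approach, which is not the method used in this answer, is to average the day of the year instead of the day of the month, so that windows can span a month boundary. The sketch below is only an illustration under stated assumptions: the function name `rolling_mean_date` is hypothetical, halves are rounded up, and the one-day shift that leap years introduce after February is ignored.

```python
import numpy as np
import pandas as pd


def rolling_mean_date(dates: pd.Series, window: int = 2) -> pd.Series:
    """Rolling mean of the day of year, mapped back to a date in each row's year.

    Hypothetical helper for illustration; leap-year offsets are ignored.
    """
    # Position of each date within its year (1..366).
    day_of_year = dates.dt.dayofyear

    # Rolling mean over the window; the first (window - 1) rows have no value.
    mean_doy = day_of_year.rolling(window).mean().dropna()

    # Round halves up (assumption); Series.round() would round half to even.
    mean_doy = np.floor(mean_doy + 0.5).astype(int)

    # January 1 of each remaining row's own year.
    years = dates.dt.year.loc[mean_doy.index]
    jan1 = pd.to_datetime(pd.DataFrame({'year': years, 'month': 1, 'day': 1}))

    # Day of year N is January 1 plus (N - 1) days.
    return jan1 + pd.to_timedelta(mean_doy - 1, unit='D')


sample = pd.to_datetime(pd.Series(
    ['1900-05-04', '1901-05-03', '1902-05-06', '1903-05-01']
))
print(rolling_mean_date(sample, window=2))
```

With a window of two, the first row is dropped because it has no predecessor to average with. Changing the rounding rule or the leap-year handling can move a result by one day.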


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Prayson W. Daniel