Precise 12 months rolling sum with pandas groupby

Is there a way using pandas to compute a rolling sum over 12 calendar months rather than 365 days (which is suggested here)? For simplicity I show just one group here, but the code should work with multiple groups. The real data spans about a century, so the one-day errors add up.

Example

Note the switch from end-of-month dates to a beginning-of-month date at the end of 2020.

import pandas as pd

data = {'date': ['2020-01-31', '2020-02-29', '2020-03-31',
                 '2020-04-30', '2020-05-31', '2020-06-30',
                 '2020-07-31', '2020-08-31', '2020-09-30',
                 '2020-10-31', '2020-11-30', '2020-12-31',
                 '2021-01-01'],
        'values': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
        'group': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}
df = pd.DataFrame(data, columns=['date', 'values', 'group'])

df['date'] = pd.to_datetime(df['date'])
df = df.sort_values('date').set_index('date')
# Time-based window of 365 days per group, requiring at least 12 rows
df.groupby('group').rolling("365d", min_periods=12).sum()[['values']]

The input

          date  values  group
0   2020-01-31       1      1
1   2020-02-29       1      1
2   2020-03-31       1      1
3   2020-04-30       1      1
4   2020-05-31       1      1
5   2020-06-30       1      1
6   2020-07-31       1      1
7   2020-08-31       1      1
8   2020-09-30       1      1
9   2020-10-31       1      1
10  2020-11-30       1      1
11  2020-12-31       1      1
12  2021-01-01       1      1

is transformed to

                  values
group date              
1     2020-01-31     NaN
      2020-02-29     NaN
      2020-03-31     NaN
      2020-04-30     NaN
      2020-05-31     NaN
      2020-06-30     NaN
      2020-07-31     NaN
      2020-08-31     NaN
      2020-09-30     NaN
      2020-10-31     NaN
      2020-11-30     NaN
      2020-12-31    12.0
      2021-01-01    13.0
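
The extra row in the last window can be checked by hand. The following is a small sketch, assuming the default behaviour of offset-based rolling windows in pandas, where the window is open on the left and closed on the right. The variable names are only illustrative.

```python
import pandas as pd

# Last timestamp in the example and the first row of the data.
end = pd.Timestamp("2021-01-01")
first_row = pd.Timestamp("2020-01-31")

# A "365d" window ending at `end` covers (end - 365 days, end].
# 2020 is a leap year (366 days), so the window starts after 2020-01-02.
window_start = end - pd.Timedelta(days=365)

# The January 2020 row is still inside the window, so 13 rows are summed.
first_row_inside = window_start < first_row <= end
print(window_start.date(), first_row_inside)
```

Because the left edge falls on 2020-01-02, the row dated 2020-01-31 is counted together with the twelve later rows.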

The desired output is

                  values
group date              
1     2020-01-31     NaN
      2020-02-29     NaN
      2020-03-31     NaN
      2020-04-30     NaN
      2020-05-31     NaN
      2020-06-30     NaN
      2020-07-31     NaN
      2020-08-31     NaN
      2020-09-30     NaN
      2020-10-31     NaN
      2020-11-30     NaN
      2020-12-31    12.0
      2021-01-01    12.0
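
One possible way to get this result, sketched under the assumption that every group has at most one row per calendar month and no missing months, is to ignore the day of month and use a count-based window of 12 rows instead of a time-based one. The `month` column and the name `rolled` are introduced here only for illustration.

```python
import pandas as pd

dates = ["2020-01-31", "2020-02-29", "2020-03-31", "2020-04-30",
         "2020-05-31", "2020-06-30", "2020-07-31", "2020-08-31",
         "2020-09-30", "2020-10-31", "2020-11-30", "2020-12-31",
         "2021-01-01"]
df = pd.DataFrame({"date": pd.to_datetime(dates),
                   "values": [1] * 13,
                   "group": [1] * 13})

# Collapse each date to its calendar month so the day of month is irrelevant.
df["month"] = df["date"].dt.to_period("M")

# Assumption of this sketch: one row per group and month.
assert not df.duplicated(["group", "month"]).any()

df = df.sort_values(["group", "month"]).set_index("date")

# Count-based window: the current row plus the 11 preceding rows of the group.
rolled = df.groupby("group")["values"].rolling(12, min_periods=12).sum()
print(rolled)
```

With gap-free monthly data, 12 rows always correspond to 12 calendar months, so leap years and the shift from month end to month start no longer matter.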

Edit

I also need to handle missing months: whenever a month is missing, the sum should start from scratch. Note that February 2020 is missing in the following example:

import pandas as pd

data = {'date': ['2020-01-31', '2020-03-31',
                 '2020-04-30', '2020-05-31', '2020-06-30',
                 '2020-07-31', '2020-08-31', '2020-09-30',
                 '2020-10-31', '2020-11-30', '2020-12-31',
                 '2021-01-31', '2021-02-28', '2021-03-31'],
        'values': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
        'group': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}
df = pd.DataFrame(data, columns=['date', 'values', 'group'])

df['date'] = pd.to_datetime(df['date'])
df = df.sort_values('date').set_index('date')
# Count-based window of 12 rows per group; it does not notice the missing month
df.groupby('group').rolling(12, min_periods=12).sum()[['values']]

Output:

                  values
group date              
1     2020-01-31     NaN
      2020-03-31     NaN
      2020-04-30     NaN
      2020-05-31     NaN
      2020-06-30     NaN
      2020-07-31     NaN
      2020-08-31     NaN
      2020-09-30     NaN
      2020-10-31     NaN
      2020-11-30     NaN
      2020-12-31     NaN
      2021-01-31    12.0
      2021-02-28    12.0
      2021-03-31    12.0

Desired output:

                  values
group date              
1     2020-01-31     NaN
      2020-03-31     NaN
      2020-04-30     NaN
      2020-05-31     NaN
      2020-06-30     NaN
      2020-07-31     NaN
      2020-08-31     NaN
      2020-09-30     NaN
      2020-10-31     NaN
      2020-11-30     NaN
      2020-12-31     NaN
      2021-01-31     NaN
      2021-02-28     NaN
      2021-03-31    12.0
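
A gap-aware variant could look like the sketch below. It is one possible approach, not a confirmed solution, and it assumes at most one row per group and calendar month. It keeps the 12-row sum only where the row 11 positions earlier in the same group is exactly 11 months older, which means the 12 rows cover 12 consecutive months. The names `month_no`, `span` and `rolling_12m` are introduced only for illustration.

```python
import pandas as pd

dates = ["2020-01-31", "2020-03-31", "2020-04-30", "2020-05-31",
         "2020-06-30", "2020-07-31", "2020-08-31", "2020-09-30",
         "2020-10-31", "2020-11-30", "2020-12-31", "2021-01-31",
         "2021-02-28", "2021-03-31"]
df = pd.DataFrame({"date": pd.to_datetime(dates),
                   "values": [1] * 14,
                   "group": [1] * 14})
df = df.sort_values(["group", "date"]).reset_index(drop=True)

# Integer month number: consecutive calendar months differ by exactly 1.
month_no = df["date"].dt.year * 12 + df["date"].dt.month

# Plain 12-row rolling sum within each group.
roll = df.groupby("group")["values"].transform(
    lambda s: s.rolling(12, min_periods=12).sum()
)

# Months elapsed between each row and the row 11 positions earlier in its group.
span = month_no - month_no.groupby(df["group"]).shift(11)

# Keep the sum only where those 12 rows cover 12 consecutive months.
df["rolling_12m"] = roll.where(span == 11)
print(df[["group", "date", "rolling_12m"]])
```

Under this reading the first complete window runs from March 2020 through February 2021, so the sketch yields 12.0 at 2021-02-28 and 2021-03-31 and NaN for every earlier row, including 2021-01-31. That is one row earlier than the desired table shows; if the sum should only restart one month later, the condition on `span` would need to be adjusted.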


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
