Problem doing a rolling calculation with groupby in pandas: the result links to the previous group

Hi, I have a dataframe which I group by the quarter of the hour (q1 to q4) and then by the hour of the day (hours 1 to 24). I then want to perform a calculation for each hour of each quarter and create a column called "reg_rate", based on an existing column "reg_sta" whose values are essentially 1 or -1.

The calculation counts the number of 1s in the past 7 days for each group, i.e. each hour and each quarter. Within each group I would expect the first 7 days of data to be NA, because 7 days' worth of data points are needed before the calculation can be done.

Here is my code:

da = df.groupby(["ValueMinuteCet",'ValueHourCet'])

cond = df['Reg_sta'] > 0
df['reg_up'] = (cond.shift().rolling(7).sum())/7

What I found is that "reg_up" is fine for the first group, but in the second group the calculation picks up the last 7 days of "reg_sta" from the previous group, so I get a value straight away in the first row of my second group instead of NA. In other words, the calculation is linked to the previous group.

Does anyone know how to perform this calculation separately for each group, without it carrying over from the previous group?
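
One possible reading of the problem: in the code above the groupby object da is created but never used, so shift() and rolling() run over the whole column and cross group boundaries. A minimal sketch of a per-group version follows, assuming the rows can be ordered by day within each group. The toy data, the "day" column and the helper column "is_up" are made up for illustration and are not from the original dataframe.

```python
import pandas as pd

# Hypothetical toy data: two hour-groups with 10 daily rows each.
days = pd.date_range("2021-01-01", periods=10, freq="D")
df = pd.DataFrame({
    "day": list(days) * 2,
    "ValueMinuteCet": 0,
    "ValueHourCet": [1] * 10 + [2] * 10,
    "Reg_sta": [1, -1, 1, 1, -1, 1, 1, 1, -1, 1] * 2,
})
keys = ["ValueMinuteCet", "ValueHourCet"]

# Rows must be in day order inside each group for the window to mean "past 7 days".
df = df.sort_values(keys + ["day"])

# 1.0 where Reg_sta is positive, 0.0 otherwise (float so that shift() can hold NaN).
df["is_up"] = (df["Reg_sta"] > 0).astype(float)

# transform() applies shift + rolling to each group on its own,
# so no window reaches back into the previous group.
df["reg_up"] = df.groupby(keys)["is_up"].transform(
    lambda s: s.shift().rolling(7).mean()
)

print(df[["ValueHourCet", "day", "Reg_sta", "reg_up"]])
```

With this layout the first 7 rows of every group come out as NaN, and rolling(7).mean() matches the sum()/7 used in the question.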

I also tried the following, which raised the error: '>' not supported between instances of 'SeriesGroupBy' and 'int'

da = df.groupby(["ValueMinuteCet",'ValueHourCet'])

cond = da['Reg_sta'] > 0
df['reg_up'] = (cond.shift().rolling(7).sum())/7
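
The error most likely arises because a SeriesGroupBy object does not support comparison operators; the comparison has to be done on the plain column first, and the grouping applied afterwards. The following is a small sketch of that order of operations, using made-up data and a window of 2 instead of 7 so the output stays short. It assumes the rows are already in time order within each group.

```python
import pandas as pd

# Hypothetical toy data: two quarter-hour groups of 4 rows each.
df = pd.DataFrame({
    "ValueMinuteCet": [0] * 4 + [15] * 4,
    "ValueHourCet": [1] * 8,
    "Reg_sta": [1, -1, 1, 1, -1, -1, 1, -1],
})
keys = ["ValueMinuteCet", "ValueHourCet"]
group_cols = [df[k] for k in keys]

# Compare on the ordinary Series (this works), then group the result.
cond = (df["Reg_sta"] > 0).astype(float)

# Shift inside each group so a row never sees the previous group's last value.
shifted = cond.groupby(group_cols).shift()

# Rolling mean inside each group; the result is indexed by (keys..., original index),
# so the key levels are dropped to align it back to df.
window = 2  # the question uses 7
rolled = shifted.groupby(group_cols).rolling(window).mean().droplevel(keys)

df["reg_up"] = rolled
print(df)
```

Assigning the result back relies on index alignment, so the original row index should be unique.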


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
