How can I write this piece of code in Pandas 1.4.1?

I have the following DataFrame:

Date Distance Position TrainerID
2017-09-03 1000 2 6529
2017-09-03 1600 4 6529
2017-09-03 1200 3 6529
2017-09-06 1200 13 6529
2017-09-08 1000 1 6529
2017-09-10 1600 9 6529
2017-09-15 1600 2 6529

For every row, I want to compute the winning percentage so far in sprint races (distance of 1200 meters or less) over the last 1000 days, grouped by TrainerID. The result will be stored in a Win% column. Dates need not be unique. The winning percentage is the value before the race happened, so the current row is excluded; in effect, the results are delayed by one row.

Rows for races that do not fit this category (non-sprint races) should carry over the most recent winning percentage from the rows above.
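
To make the requirement concrete, here is a minimal pure-Python sketch of the value I expect on each row of the sample data. It is only a reference for checking results, not the pandas code itself. The names `win_pct_before`, `WINDOW` and `SPRINT_MAX` are made up for illustration. It assumes that same-day races keep the order shown in the table, that the 1000 days are counted back from the current row's date, and that a row with no earlier sprint race gets 0.

```python
from datetime import date, timedelta

# Sample rows from the table above: (Date, Distance, Position, TrainerID)
rows = [
    (date(2017, 9, 3), 1000, 2, 6529),
    (date(2017, 9, 3), 1600, 4, 6529),
    (date(2017, 9, 3), 1200, 3, 6529),
    (date(2017, 9, 6), 1200, 13, 6529),
    (date(2017, 9, 8), 1000, 1, 6529),
    (date(2017, 9, 10), 1600, 9, 6529),
    (date(2017, 9, 15), 1600, 2, 6529),
]

WINDOW = timedelta(days=1000)  # look-back period (assumed to start at the current row's date)
SPRINT_MAX = 1200              # a sprint race is 1200 meters or less


def win_pct_before(rows):
    """Return, for each row, the trainer's sprint win % over the earlier rows."""
    # sorted() is stable, so same-day races keep their original order
    ordered = sorted(rows, key=lambda r: (r[3], r[0]))
    result = []
    for i, (day, _, _, trainer) in enumerate(ordered):
        # Positions of earlier sprint races by the same trainer inside the window
        earlier = [
            pos
            for d, dist, pos, t in ordered[:i]
            if t == trainer and dist <= SPRINT_MAX and day - d < WINDOW
        ]
        wins = sum(1 for pos in earlier if pos == 1)
        # Assumption: no earlier sprint race means a win % of 0
        result.append(round(100 * wins / len(earlier)) if earlier else 0)
    return result


win_pct = win_pct_before(rows)
print(win_pct)  # [0, 0, 0, 0, 0, 25, 25]
```

The only win is the 1000 m race on 2017-09-08, so the percentage becomes 25 (1 win out of 4 sprint races) from the following row onward, and the two 1600 m rows at the end simply carry that value.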

I have the solution, which is the following code:

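# Sort by trainer, then by date; the stable sort keeps same-day races in their original order
df = df.sort_values(['TrainerID', 'Date'], kind='stable')
no_days = '1000D'  # look-back window
# Sprint races: 1200 meters or less
mask = df.Distance <= 1200
# Keep Position for sprint races only; all other rows become NaN
df = df.assign(Position_for_calc=df.loc[mask, 'Position'])
df = df.set_index('Date')
# Percentage of wins (Position == 1) among the non-null positions in the window
calc = lambda s: round(100*s.eq(1).sum()/s.notnull().sum())
# Roll per trainer, then shift by one row so that the current race is excluded
ser_win = df.groupby('TrainerID').rolling(no_days)['Position_for_calc'].apply(calc).groupby('TrainerID').shift().fillna(0).values
df = df.assign(Win=ser_win).drop('Position_for_calc', axis=1)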

In Pandas 1.3.5, it works perfectly. However, in Pandas 1.4.1 (the latest version at the time of writing), it fails with an error message stating that the Date must be monotonic.

Is there any way I can write the above code for the newest version of Pandas?
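
One possible direction, offered as a hedged sketch rather than a confirmed fix: avoid `groupby(...).rolling(...)` altogether and roll each trainer's rows separately, so that only the dates within one trainer need to be sorted. This assumes the error comes from the Date index no longer being monotonic across the whole frame once the rows are sorted by TrainerID first; I have not verified that this is the exact cause. The names `is_sprint`, `is_win` and `parts` are introduced here for illustration, and the rolling mean of a 0/1 win flag replaces the `calc` lambda.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'Date': pd.to_datetime(['2017-09-03', '2017-09-03', '2017-09-03', '2017-09-06',
                            '2017-09-08', '2017-09-10', '2017-09-15']),
    'Distance': [1000, 1600, 1200, 1200, 1000, 1600, 1600],
    'Position': [2, 4, 3, 13, 1, 9, 2],
    'TrainerID': [6529] * 7,
})

no_days = '1000D'

# Stable sort as in the question; the fresh RangeIndex gives every row a unique label
df = (df.sort_values(['TrainerID', 'Date'], kind='stable')
        .reset_index(drop=True))

# 1.0 for a sprint win, 0.0 for a sprint loss, NaN for non-sprint races
is_sprint = df['Distance'] <= 1200
is_win = df['Position'].eq(1).astype(float).where(is_sprint)

parts = []
for _, g in df.groupby('TrainerID'):
    # One trainer's win flags, indexed by date (sorted within the group)
    s = pd.Series(is_win.loc[g.index].to_numpy(), index=pd.DatetimeIndex(g['Date']))
    # The rolling mean skips NaN, so it equals wins / sprint races in the window
    pct = (100 * s.rolling(no_days).mean()).round()
    # Shift by one row to exclude the current race; no history yet gives 0
    pct = pct.shift().fillna(0)
    # Put the original row labels back so the values align with df
    parts.append(pd.Series(pct.to_numpy(), index=g.index))

df['Win'] = pd.concat(parts)
print(df)
```

Because each trainer is rolled on their own date-indexed Series, the global order of dates across trainers no longer matters, and the values are aligned back to `df` through the unique row labels instead of relying on `.values` ordering.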



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
