Moving aggregation based on conditions (dataframe)

I have the following data:

Material  Plant  Date        count  backwards cumulative of count  total
ACID      1800   2021-07-01  1      3                              100
ACID      1800   2021-09-01  1      2                              200
ACID      1800   2021-10-01  1      1                              300
ACID      1820   2021-09-01  2      9                              400
ACID      1820   2021-10-01  2      7                              500
ACID      1820   2021-11-01  2      5                              200
ACID      1820   2021-12-01  3      3                              100

For each Material and Plant, I need to find the most recent date whose backwards cumulative count is > 1, and then sum the total column over the rows from that date onwards. The date reported for the group should be that most recent qualifying date.

This is the output I should get:

Material  Plant  date        total
ACID      1800   2021-09-01  500
ACID      1820   2021-12-01  100

The first row is the sum of the totals for 2021-09-01 and 2021-10-01 (200 + 300 = 500).

I can get the rows where the cumulative count is above 1, and I know I have to use a groupby and max function in between, but I'm just not sure how.

import numpy as np
# Positional indices of the rows whose backwards cumulative count is above 1
select_indices = np.where(df2["backwards cumulative of count"] > 1)[0]
df2.iloc[select_indices]
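
The "groupby and max" step mentioned above could look roughly like the sketch below. It rebuilds the sample data from the question, keeps the rows that satisfy the condition with a boolean mask (equivalent to the `np.where` selection), and takes the latest Date per Material and Plant. It assumes the Date column is parsed as datetime; the names `mask` and `cutoff` are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df2 = pd.DataFrame({
    "Material": ["ACID"] * 7,
    "Plant": [1800, 1800, 1800, 1820, 1820, 1820, 1820],
    "Date": pd.to_datetime([
        "2021-07-01", "2021-09-01", "2021-10-01",
        "2021-09-01", "2021-10-01", "2021-11-01", "2021-12-01",
    ]),
    "count": [1, 1, 1, 2, 2, 2, 3],
    "backwards cumulative of count": [3, 2, 1, 9, 7, 5, 3],
    "total": [100, 200, 300, 400, 500, 200, 100],
})

# Boolean mask: rows whose backwards cumulative count exceeds 1.
mask = df2["backwards cumulative of count"] > 1

# Most recent qualifying date per (Material, Plant) group.
cutoff = (
    df2[mask]
    .groupby(["Material", "Plant"], as_index=False)["Date"]
    .max()
)
print(cutoff)
```

This only yields the cutoff date for each group (2021-09-01 for plant 1800 and 2021-12-01 for plant 1820); the totals still have to be summed over the rows on or after that date.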

Another way to do it would be to simply remove the irrelevant rows first, so we would end up with:

Material  Plant  Date        count  backwards cumulative of count  total
ACID      1800   2021-09-01  1      2                              200
ACID      1800   2021-10-01  1      1                              300
ACID      1820   2021-12-01  3      3                              100

and then do the aggregation.
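
A minimal sketch of this "filter, then aggregate" idea, under the assumption that "irrelevant" means every row dated before the group's most recent date with a backwards cumulative count above 1. The helper names (`qual_date`, `cutoff`, `kept`, `result`) are made up for illustration, and Date is assumed to be a datetime column.

```python
import pandas as pd

# Sample data copied from the question.
df2 = pd.DataFrame({
    "Material": ["ACID"] * 7,
    "Plant": [1800, 1800, 1800, 1820, 1820, 1820, 1820],
    "Date": pd.to_datetime([
        "2021-07-01", "2021-09-01", "2021-10-01",
        "2021-09-01", "2021-10-01", "2021-11-01", "2021-12-01",
    ]),
    "count": [1, 1, 1, 2, 2, 2, 3],
    "backwards cumulative of count": [3, 2, 1, 9, 7, 5, 3],
    "total": [100, 200, 300, 400, 500, 200, 100],
})

# Keep the date only where the condition holds (NaT elsewhere).
qual_date = df2["Date"].where(df2["backwards cumulative of count"] > 1)

# Broadcast the latest qualifying date of each group back onto every row.
cutoff = qual_date.groupby([df2["Material"], df2["Plant"]]).transform("max")

# Drop the rows that come before their group's cutoff date.
kept = df2[df2["Date"] >= cutoff]

# Aggregate what is left: earliest kept date (the cutoff) and summed total.
result = (
    kept.groupby(["Material", "Plant"], as_index=False)
        .agg(date=("Date", "min"), total=("total", "sum"))
)
print(result)
```

On the sample data, `kept` holds the three rows listed above and `result` has one row per plant with totals 500 and 100. A group with no qualifying row gets a `NaT` cutoff and therefore drops out of the result entirely, which may or may not be the behaviour you want.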



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
