Moving aggregation based on conditions (dataframe)

I have the following data:

Material	Plant	Date	count	backwards cumulative of count	total
ACID	1800	2021-07-01	1	3	100
ACID	1800	2021-09-01	1	2	200
ACID	1800	2021-10-01	1	1	300
ACID	1820	2021-09-01	2	9	400
ACID	1820	2021-10-01	2	7	500
ACID	1820	2021-11-01	2	5	200
ACID	1820	2021-12-01	3	3	100
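
To make the example reproducible, the table above can be rebuilt as a DataFrame. This is only a sketch: the name df2 is borrowed from the snippet later in the question, and the idea that "backwards cumulative of count" is a running sum of count from the latest date back to the earliest within each Material/Plant group is an assumption inferred from the numbers, not something stated explicitly.

```python
import pandas as pd

df2 = pd.DataFrame({
    "Material": ["ACID"] * 7,
    "Plant": [1800, 1800, 1800, 1820, 1820, 1820, 1820],
    "Date": pd.to_datetime([
        "2021-07-01", "2021-09-01", "2021-10-01",
        "2021-09-01", "2021-10-01", "2021-11-01", "2021-12-01",
    ]),
    "count": [1, 1, 1, 2, 2, 2, 3],
    "backwards cumulative of count": [3, 2, 1, 9, 7, 5, 3],
    "total": [100, 200, 300, 400, 500, 200, 100],
})

# Assumption: the cumulative column is a running sum of "count" taken from the
# latest date back to the earliest, separately for each Material/Plant group.
recomputed = (
    df2.sort_values("Date", ascending=False)   # newest rows first
       .groupby(["Material", "Plant"])["count"]
       .cumsum()                               # running sum within each group
       .sort_index()                           # restore the original row order
)
```

On this sample, recomputed matches the "backwards cumulative of count" column row for row.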

For each Material and Plant, I need the sum of total starting from the most recent date on which the backwards cumulative count is > 1, including any later dates. The date reported should be that most recent qualifying date.

This is the output I should get:

Material	Plant	date	total
ACID	1800	2021-09-01	500
ACID	1820	2021-12-01	100

The total in the first row is the sum of the totals for 2021-09-01 and 2021-10-01 (200 + 300).

I can get the rows where the cumulative count is above 1, and I know I have to use a groupby and max function in between, but I'm just not sure how.

import numpy as np
# Positions of the rows whose backwards cumulative count is above 1
select_indices = list(np.where(df2["backwards cumulative of count"] > 1)[0])
df2.iloc[select_indices]
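
One possible way to fill in the "groupby and max" step: a boolean mask does the same job as np.where, and the latest qualifying date per Material/Plant comes from groupby(...).max(). The frame is rebuilt here so the snippet runs on its own, and the names mask and cutoff are illustrative only.

```python
import pandas as pd

df2 = pd.DataFrame({
    "Material": ["ACID"] * 7,
    "Plant": [1800, 1800, 1800, 1820, 1820, 1820, 1820],
    "Date": pd.to_datetime([
        "2021-07-01", "2021-09-01", "2021-10-01",
        "2021-09-01", "2021-10-01", "2021-11-01", "2021-12-01",
    ]),
    "count": [1, 1, 1, 2, 2, 2, 3],
    "backwards cumulative of count": [3, 2, 1, 9, 7, 5, 3],
    "total": [100, 200, 300, 400, 500, 200, 100],
})

# Boolean mask: True where the backwards cumulative count is above 1.
mask = df2["backwards cumulative of count"] > 1

# Most recent qualifying date for each Material/Plant pair.
cutoff = (
    df2[mask]
    .groupby(["Material", "Plant"], as_index=False)["Date"]
    .max()
)
```

A group with no row above the threshold does not appear in cutoff at all, which may or may not be the desired behaviour.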

Another way to do it is to simply remove the irrelevant rows (those dated before the most recent row whose cumulative count is > 1), so we would end up with:

Material	Plant	Date	count	backwards cumulative of count	total
ACID	1800	2021-09-01	1	2	200
ACID	1800	2021-10-01	1	1	300
ACID	1820	2021-12-01	3	3	100

and then do the aggregation.
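
The filter-then-aggregate idea could be sketched as follows. This is an illustration under assumptions rather than a definitive answer: the helper column qualifying and the names keys, above, cutoff, kept and result are made up for the example, and Date is assumed to be parsed as a datetime.

```python
import pandas as pd

df2 = pd.DataFrame({
    "Material": ["ACID"] * 7,
    "Plant": [1800, 1800, 1800, 1820, 1820, 1820, 1820],
    "Date": pd.to_datetime([
        "2021-07-01", "2021-09-01", "2021-10-01",
        "2021-09-01", "2021-10-01", "2021-11-01", "2021-12-01",
    ]),
    "count": [1, 1, 1, 2, 2, 2, 3],
    "backwards cumulative of count": [3, 2, 1, 9, 7, 5, 3],
    "total": [100, 200, 300, 400, 500, 200, 100],
})

keys = ["Material", "Plant"]
above = df2["backwards cumulative of count"] > 1

# Keep the date only on qualifying rows (others become NaT), then broadcast the
# latest qualifying date of each group back onto every row of that group.
cutoff = (
    df2.assign(qualifying=df2["Date"].where(above))
       .groupby(keys)["qualifying"]
       .transform("max")
)

# Drop rows dated before their group's cutoff; comparisons with NaT are False,
# so groups without any qualifying row are dropped entirely.
kept = df2[df2["Date"] >= cutoff]

# Aggregate what is left: earliest kept date and the summed total per group.
result = kept.groupby(keys, as_index=False).agg(
    date=("Date", "min"),
    total=("total", "sum"),
)
```

On the sample data, kept holds the three rows listed above, and result gives 500 for plant 1800 (dated 2021-09-01) and 100 for plant 1820 (dated 2021-12-01).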

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
