'Pandas: Group by and conditional sum based on value of current row
My dataframe looks like this:
| customer_nr | order_value | year_ordered | payment_successful |
|---|---|---|---|
| 1 | 50 | 1980 | 1 |
| 1 | 75 | 2017 | 0 |
| 1 | 10 | 2020 | 1 |
| 2 | 55 | 2000 | 1 |
| 2 | 300 | 2007 | 1 |
| 2 | 15 | 2010 | 0 |
I want to know the total amount a customer has successfully paid in the years before, for a specific order.
The expected output is as follows:
| customer_nr | order_value | year_ordered | payment_successful | total_successfully_previously_paid |
|---|---|---|---|---|
| 1 | 50 | 1980 | 1 | 0 |
| 1 | 75 | 2017 | 0 | 50 |
| 1 | 10 | 2020 | 1 | 50 |
| 2 | 55 | 2000 | 1 | 0 |
| 2 | 300 | 2007 | 1 | 55 |
| 2 | 15 | 2010 | 0 | 355 |
Closest i've gotten is this:
df.groupby(['customer_nr', 'payment_successful'], as_index=False)['order_value'].sum()
That just gives me the summed amount successfully and unsuccessfully paid all time per customer. It doesn't account for selecting only previous orders to participate in the sum.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|
