Filter table on aggregate value and multi-index
I have the following data set:
shipments.head(7)
Origin Destination Date Quantity
0 Atlanta LA 2021-09-09 1
1 Atlanta LA 2021-09-11 4
2 Atlanta Chicago 2021-09-16 1
3 Atlanta Seattle 2021-09-27 12
4 Seattle LA 2021-09-29 2
5 Seattle Atlanta 2021-09-13 2
6 Seattle Newark 2021-09-17 7
This table represents the number of items (Quantity) that were sent from a given origin to a given destination on a given date. It contains one month of data and was read with:
shipments = pd.read_csv('shipments.csv', parse_dates=['Date'])
Using the shipment data, I can create a new aggregated table that shows the total quantity shipped between every (Origin, Destination) pair during this month:
shipments_agg = shipments.groupby(['Origin', 'Destination'])[['Quantity']].sum()
As a last step, I'd like to create a new table based on the shipments table, where a row (Origin, Destination, Date, Quantity) is included only if the aggregate Quantity for its (Origin, Destination) pair is larger than 50. In other words, a row should be kept only if its (Origin, Destination) entry in shipments_agg has a Quantity larger than 50. I'm not quite sure how to accomplish this.
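One possible way to express this filter is sketched below. It is not necessarily the only or best approach. The sample rows are made up for illustration (they are not the real shipments.csv), and the names pair_total, big_pairs, row_keys, filtered and filtered_via_index are hypothetical. The first option broadcasts each pair's total back onto its rows with groupby(...).transform('sum'); the second looks up each row's (Origin, Destination) key in the multi-indexed shipments_agg.

```python
import pandas as pd

# Hypothetical sample data standing in for shipments.csv (values are made up).
shipments = pd.DataFrame({
    "Origin": ["Atlanta", "Atlanta", "Atlanta", "Seattle", "Seattle"],
    "Destination": ["LA", "LA", "Chicago", "LA", "LA"],
    "Date": pd.to_datetime(
        ["2021-09-09", "2021-09-11", "2021-09-16", "2021-09-29", "2021-09-30"]
    ),
    "Quantity": [30, 25, 1, 2, 60],
})

# Option 1: compute each pair's total as a row-aligned Series, then filter.
# transform("sum") returns one value per original row, so it can be used
# directly as a boolean mask on the un-aggregated table.
pair_total = shipments.groupby(["Origin", "Destination"])["Quantity"].transform("sum")
filtered = shipments[pair_total > 50]

# Option 2: reuse the multi-indexed aggregate table.
shipments_agg = shipments.groupby(["Origin", "Destination"])[["Quantity"]].sum()
# Index entries (Origin, Destination) whose total exceeds the threshold.
big_pairs = shipments_agg.index[shipments_agg["Quantity"] > 50]
# Build the same kind of key for every row and test membership.
row_keys = pd.MultiIndex.from_frame(shipments[["Origin", "Destination"]])
filtered_via_index = shipments[row_keys.isin(big_pairs)]

print(filtered)
```

With this sample data both options keep the Atlanta to LA and Seattle to LA rows and drop the Atlanta to Chicago row. The transform version avoids building the intermediate aggregate, while the index version may be preferable if shipments_agg is needed anyway.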
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
