Filter table on aggregate value and multi-index

I have the following data set:

shipments.head(7)
    Origin   Destination  Date        Quantity
0   Atlanta  LA           2021-09-09  1
1   Atlanta  LA           2021-09-11  4
2   Atlanta  Chicago      2021-09-16  1
3   Atlanta  Seattle      2021-09-27  12
4   Seattle  LA           2021-09-29  2
5   Seattle  Atlanta      2021-09-13  2
6   Seattle  Newark       2021-09-17  7

This table represents the number of items (Quantity) that were sent from a given origin to a given destination on a given date. The table contains 1 month of data. This table was read with:

shipments = pd.read_csv('shipments.csv', parse_dates=['Date'])

Using the shipment data, I can create a new aggregated table that shows the total quantity shipped between every (Origin, Destination) pair during this month:

shipments_agg = shipments.groupby(['Origin', 'Destination'])[['Quantity']].sum()

As a last step, I'd like to create a new table based on the shipments table, where a row (Origin, Destination, Date, Quantity) is only included if the aggregate Quantity for the (Origin,Destination) pair is larger than 50. In other words, a row (Origin, Destination, Date, Quantity) should only be included if (Origin,Destination) in shipments_agg has a Quantity larger than 50. I'm not quite sure how to accomplish this.
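One possible way to do this, sketched under assumptions: the small DataFrame below is made-up sample data standing in for shipments.csv (its quantities are not from the real file), and the names pair_total, big_pairs, and row_pairs are only illustrative. The first option uses groupby(...).transform('sum') to attach each pair's total to every row and then filters on it. The second option reuses the MultiIndex of shipments_agg and checks membership row by row.

```python
import pandas as pd

# Hypothetical sample data standing in for shipments.csv (values are invented).
shipments = pd.DataFrame({
    "Origin": ["Atlanta", "Atlanta", "Atlanta", "Seattle", "Seattle"],
    "Destination": ["LA", "LA", "Chicago", "LA", "LA"],
    "Date": pd.to_datetime(
        ["2021-09-09", "2021-09-11", "2021-09-16", "2021-09-29", "2021-09-30"]
    ),
    "Quantity": [30, 25, 1, 20, 10],
})

# Option 1: broadcast each pair's total back onto its rows, then filter.
# transform("sum") returns a Series aligned with the original row index.
pair_total = shipments.groupby(["Origin", "Destination"])["Quantity"].transform("sum")
filtered = shipments[pair_total > 50]

# Option 2: build the aggregate table, keep the (Origin, Destination) pairs
# whose total exceeds 50, and test each row's pair for membership.
shipments_agg = shipments.groupby(["Origin", "Destination"])[["Quantity"]].sum()
big_pairs = shipments_agg[shipments_agg["Quantity"] > 50].index
row_pairs = pd.MultiIndex.from_frame(shipments[["Origin", "Destination"]])
filtered_via_index = shipments[row_pairs.isin(big_pairs)]

print(filtered)
```

In this sample only the (Atlanta, LA) pair totals more than 50, so both options keep just its two rows. The transform version avoids building the intermediate table. The MultiIndex version may be preferable if shipments_agg is needed anyway.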

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
