Get count of combinations within hierarchical data in a Python Pandas DataFrame

I have a data set of orders with the item ordered, the quantity ordered, and the box it was shipped in. I'd like to find the possible order combinations of [Box Type, Item, Quantity] and assign each order an identifier for its combination for further analysis. Ideally, the output would look like this:

d2 = {'Order Number': [1, 2, 3], 'Order Type': [1, 2, 1]}
pd.DataFrame(d2)

Where grouping by 'Order Type' would provide a count of the unique order types.
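For illustration, the count described above could be taken from the desired output frame roughly like this. This is only a sketch using the column names from the question; `out` and `type_counts` are names made up for the example.

```python
import pandas as pd

# Desired output from the question: one row per order with its combination id
d2 = {'Order Number': [1, 2, 3], 'Order Type': [1, 2, 1]}
out = pd.DataFrame(d2)

# Number of orders that share each 'Order Type'
type_counts = out.groupby('Order Type').size()
print(type_counts)
```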

The problem is that each box is assigned a unique code, which is necessary to tell whether a box held multiple items. In the example data below, "Box_id" = 3 shows that the second "Box A" (in order 2) contains two items, A2 and A3. While this field is needed to

import pandas as pd

d = {'Order Number': [1, 2, 2, 2, 3], 'Box_id': [1, 2, 3, 3, 4], 'Box Type': ['Box A', 'Box B', 'Box A', 'Box A', 'Box A'], 
     'Item': ['A1', 'A2', 'A2', 'A3', 'A1'], 'Quantity': [2, 4, 2, 2, 2]}

pd.DataFrame(d)

I have tried representing each order as a tuple of its [Box type, Item, Quantity] data and using those tuples to capture counts with a default dictionary, but that output is understandably messy to interpret and difficult to match with orders afterwards.

from collections import defaultdict

# Each order is a tuple of boxes; each box is a tuple of (Box Type, Item, Quantity) rows.
combinations = defaultdict(int)

Order1 = ((('Box A', 'A1', 2),),)
Order2 = ((('Box B', 'A2', 4),), (('Box A', 'A2', 2), ('Box A', 'A3', 2)))
Order3 = ((('Box A', 'A1', 2),),)

combinations[Order1] += 1
combinations[Order2] += 1
combinations[Order3] += 1

# Should result in
# {((('Box A', 'A1', 2),),): 2,
#  ((('Box B', 'A2', 4),), (('Box A', 'A2', 2), ('Box A', 'A3', 2))): 1}

Is there an easier way to get a representation of unique order combinations and their counts?
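One possible way to do this in pandas is sketched below. It is not a definitive answer. It assumes that two orders have the same type when they contain the same boxes, and that two boxes match when they hold the same (Box Type, Item, Quantity) rows, regardless of row order or of the particular Box_id values. The names `row_sig`, `box_sig`, `order_sig`, `type_ids` and `result` are made up for the example. Box_id is used only to group rows into boxes and never becomes part of the signature.

```python
import pandas as pd

d = {'Order Number': [1, 2, 2, 2, 3],
     'Box_id': [1, 2, 3, 3, 4],
     'Box Type': ['Box A', 'Box B', 'Box A', 'Box A', 'Box A'],
     'Item': ['A1', 'A2', 'A2', 'A3', 'A1'],
     'Quantity': [2, 4, 2, 2, 2]}
df = pd.DataFrame(d)

# Step 1: one hashable (Box Type, Item, Quantity) tuple per row
rows = df.assign(row_sig=list(zip(df['Box Type'], df['Item'], df['Quantity'])))

# Step 2: one signature per physical box (sorted so row order does not matter)
box_sig = (rows.groupby(['Order Number', 'Box_id'])['row_sig']
               .agg(lambda s: tuple(sorted(s))))

# Step 3: one signature per order (sorted so box order does not matter)
order_sig = (box_sig.groupby(level='Order Number')
                    .agg(lambda s: tuple(sorted(s))))

# Step 4: number the distinct signatures in first-seen order, starting at 1
type_ids = {}
for sig in order_sig:
    type_ids.setdefault(sig, len(type_ids) + 1)

result = pd.DataFrame({
    'Order Number': order_sig.index,
    'Order Type': [type_ids[sig] for sig in order_sig],
})

# Count of orders per combination
counts = result['Order Type'].value_counts().sort_index()
print(result)
print(counts)
```

Because `type_ids` maps each full signature to its identifier, an 'Order Type' can be traced back to the boxes and items it stands for, and `result` can be merged onto the original frame by 'Order Number' for further analysis.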

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
