Creating subsets on multiple features in Python for segmentation

I want to segment a dataset of items (labeled with IDs) that have several categorical features, each taking a different set of values (for instance, color takes 'blue', 'orange', 'green'; size takes 'S', 'M', 'L'; brand takes 'Brand A', 'Brand B', etc.):

ID  Brand    Color   Size  Price
1   Brand 1  Orange  S     23
2   Brand 2  Blue    XXL   3
3   Brand 1  Green   XXXL  45
4   Brand 2  Blue    M     200

I can easily do it by hand for 1 or 2 features (with a small number of values). For example, if I segment by brand I get:

ID  Brand    Color   Size  Price
1   Brand 1  Orange  S     23
3   Brand 1  Green   XXXL  45

and

ID  Brand    Color   Size  Price
2   Brand 2  Blue    XXL   3
4   Brand 2  Blue    M     200

Unfortunately, some features take 10+ values. Moreover, the number of subsets explodes if I segment on more than one feature. I want to test different levels of segmentation (e.g. color + brand, color + brand + size), which is why doing it by hand is not an option.
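To make the growth concrete, here is a rough sketch that counts how many non-empty segments each level of segmentation would produce on the sample data. It assumes the data is held in a pandas DataFrame with the column names from the table above; the names `features` and `segment_counts` are only illustrative.

```python
from itertools import combinations

import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Brand": ["Brand 1", "Brand 2", "Brand 1", "Brand 2"],
    "Color": ["Orange", "Blue", "Green", "Blue"],
    "Size": ["S", "XXL", "XXXL", "M"],
    "Price": [23, 3, 45, 200],
})

features = ["Brand", "Color", "Size"]

# For every combination of 1, 2, ... features, count the distinct
# value combinations that actually occur in the data.
segment_counts = {}
for k in range(1, len(features) + 1):
    for combo in combinations(features, k):
        segment_counts[combo] = df.groupby(list(combo)).ngroups

for combo, n in segment_counts.items():
    print(" + ".join(combo), "->", n, "segments")
```

Only combinations that occur in the data are counted, so the number of segments can never exceed the number of rows, even when the theoretical number of value combinations is much larger.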

I am trying to write a function that takes the dataframe and a list of features as input and outputs all the different subsets, but for now, my code is worthless.
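For reference, the kind of function I have in mind might look like the sketch below. It assumes pandas and relies on `DataFrame.groupby`; the function name `segment` and the choice of a dict keyed by tuples of feature values are my own assumptions, not an established API.

```python
import pandas as pd


def segment(df, features):
    """Return a dict mapping each observed combination of feature
    values (as a tuple) to the sub-DataFrame of matching rows."""
    subsets = {}
    for key, group in df.groupby(list(features)):
        # Depending on the pandas version, grouping by a single column
        # yields either a scalar key or a 1-tuple; normalise to a tuple.
        if not isinstance(key, tuple):
            key = (key,)
        subsets[key] = group
    return subsets


# Sample data copied from the table in the question.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Brand": ["Brand 1", "Brand 2", "Brand 1", "Brand 2"],
    "Color": ["Orange", "Blue", "Green", "Blue"],
    "Size": ["S", "XXL", "XXXL", "M"],
    "Price": [23, 3, 45, 200],
})

by_brand = segment(df, ["Brand"])
by_brand_color = segment(df, ["Brand", "Color"])

for key, subset in by_brand.items():
    print(key)
    print(subset)
```

With the sample data, `by_brand` should hold the two brand subsets shown above, and `by_brand_color` one subset per brand and color pair that actually occurs.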

Thank you in advance if you think you can help me!



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
