How to generate all possible column combinations in 4 different pandas dataframes (WITHOUT any library)

I have four dataframes

1 - list of all soups (all other numbers are just detailed information about the dish)

          meal category  calories  protein   fat  carbs  amount  price
0    bean soup     soup        41     2.23  2.75   2.23     350    0.7
1  tomato soup     soup        45     0.68  1.53   7.70     350    0.7

2 - list of all main dishes

                   meal   category  calories  ...  carbs  amount  price
0  baked chicken thighs  main dish       129  ...   1.86     100    2.6
1         fried chicken  main dish       369  ...  28.70     180    2.6
2     fried cauliflower  main dish       256  ...  24.10     170    2.8

3 - list of all side dishes

                meal  category  calories  protein   fat  carbs  amount  price
0              pasta  sidedish       135     3.50  2.50  22.80     225   0.7
1  american potatoes  sidedish       143     2.55  4.09  24.02     220   1.3
2              fries  sidedish       143     2.55  4.09  24.02     200   1.4

4 - list of all desserts

        meal category  calories  protein   fat  carbs  amount  price
0  tangerine  dessert        39     0.72  0.30    7.7      85   0.25
1      apple  dessert        49     0.37  0.40    9.9     130   0.20
2     banana  dessert        90     1.20  0.24   19.8     120   0.25

There may be a different number of dishes.

I need to build all possible lunch options, where each option combines one soup, one main dish, one side dish and one dessert.

The output should be a list of tuples, each containing 4 pandas Series.

Combination_list = [(SoupS,MainS,SideS,DessertS),(SoupS1,MainS1,SideS1,DessertS1)...]

Where (for example)

SoupS =

0    bean soup
1         soup
2           41
3         2.23
4         2.75
5         2.23
6          350
7          0.7

How can I write my own function for this without using other libraries (only pandas)?



Solution 1:[1]

Here is a way to get a list of DataFrames, each holding one complete meal, using itertools.product from the standard library (it ships with Python, so it is not an external package):

from itertools import product

import pandas as pd

# say the four original DataFrames are a, b, c, and d
out = list(map(
    pd.DataFrame,
    product(*map(lambda df: df.to_dict('records'), [a, b, c, d]))
))

Now out contains 54 distinct meal combinations (2 soups * 3 main dishes * 3 side dishes * 3 desserts). For example:

>>> out[0]
                   meal   category  calories  protein   fat  carbs  amount  price
0             bean soup       soup        41     2.23  2.75   2.23     350   0.70
1  baked chicken thighs  main dish       129      NaN   NaN   1.86     100   2.60
2                 pasta   sidedish       135     3.50  2.50  22.80     225   0.70
3             tangerine    dessert        39     0.72  0.30   7.70      85   0.25

>>> out[20]
                meal   category  calories  protein   fat  carbs  amount  price
0          bean soup       soup        41     2.23  2.75   2.23     350   0.70
1  fried cauliflower  main dish       256      NaN   NaN  24.10     170   2.80
2              pasta   sidedish       135     3.50  2.50  22.80     225   0.70
3             banana    dessert        90     1.20  0.24  19.80     120   0.25

Note: the NaN values appear because I copied your sample data, which is incomplete for the main dishes (I can't make up what's in the '...' columns).
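
If you want exactly what the question asks for (a list of tuples of 4 pandas Series rather than a list of DataFrames), the same idea can be adapted. This is only a minimal sketch: the tiny frames and the helper name rows are made up for illustration and carry fewer columns than the real data.

```python
from itertools import product

import pandas as pd

# Hypothetical stand-in frames, much smaller than the ones in the question.
soups = pd.DataFrame({'meal': ['bean soup', 'tomato soup'], 'price': [0.7, 0.7]})
mains = pd.DataFrame({'meal': ['fried chicken'], 'price': [2.6]})
sides = pd.DataFrame({'meal': ['pasta', 'fries'], 'price': [0.7, 1.4]})
desserts = pd.DataFrame({'meal': ['apple', 'banana', 'tangerine'],
                         'price': [0.2, 0.25, 0.25]})


def rows(df):
    """Return the rows of df as a list of pandas Series (illustrative helper)."""
    return [row for _, row in df.iterrows()]


# One tuple per lunch: (soup Series, main Series, side Series, dessert Series).
combination_list = list(
    product(*(rows(df) for df in (soups, mains, sides, desserts)))
)

print(len(combination_list))  # 2 * 1 * 2 * 3 = 12
```

Each Series keeps the column names as its index, so combination_list[0][0]['meal'] gives the soup name of the first lunch.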

To reproduce the above:

# reproducible setup
import pandas as pd

a = pd.DataFrame(
    [['bean soup', 'soup', 41, 2.23, 2.75, 2.23, 350, 0.7],
     ['tomato soup', 'soup', 45, 0.68, 1.53, 7.7, 350, 0.7]],
    columns=['meal', 'category', 'calories', 'protein', 'fat', 'carbs', 'amount', 'price'])

b = pd.DataFrame(
    [['baked chicken thighs', 'main dish', 129, 1.86, 100, 2.6],
     ['fried chicken', 'main dish', 369, 28.7, 180, 2.6],
     ['fried cauliflower', 'main dish', 256, 24.1, 170, 2.8]],
    columns=['meal', 'category', 'calories', 'carbs', 'amount', 'price'])

c = pd.DataFrame(
    [['pasta', 'sidedish', 135, 3.5, 2.5, 22.8, 225, 0.7],
     ['american potatoes', 'sidedish', 143, 2.55, 4.09, 24.02, 220, 1.3],
     ['fries', 'sidedish', 143, 2.55, 4.09, 24.02, 200, 1.4]],
    columns=['meal', 'category', 'calories', 'protein', 'fat', 'carbs', 'amount', 'price'])

d = pd.DataFrame(
    [['tangerine', 'dessert', 39, 0.72, 0.3, 7.7, 85, 0.25],
     ['apple', 'dessert', 49, 0.37, 0.4, 9.9, 130, 0.2],
     ['banana', 'dessert', 90, 1.2, 0.24, 19.8, 120, 0.25]],
    columns=['meal', 'category', 'calories', 'protein', 'fat', 'carbs', 'amount', 'price'])
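
If the goal is to stay with pandas only, a cross join is another option. The sketch below assumes pandas 1.2 or newer (where merge accepts how='cross') and uses small made-up frames; the course prefixes are my own naming, added so the identically named columns don't clash. Note that it gives one wide DataFrame with a row per lunch, not a list of tuples.

```python
import pandas as pd

# Hypothetical stand-in frames with only two columns each.
soups = pd.DataFrame({'meal': ['bean soup', 'tomato soup'], 'price': [0.7, 0.7]})
mains = pd.DataFrame({'meal': ['fried chicken'], 'price': [2.6]})
sides = pd.DataFrame({'meal': ['pasta', 'fries'], 'price': [0.7, 1.4]})
desserts = pd.DataFrame({'meal': ['apple', 'banana', 'tangerine'],
                         'price': [0.2, 0.25, 0.25]})

frames = {'soup': soups, 'main': mains, 'side': sides, 'dessert': desserts}

menu = None
for course, df in frames.items():
    # Prefix the columns (soup_meal, soup_price, ...) to keep them distinct.
    prefixed = df.add_prefix(f'{course}_')
    # Cross join: every row so far is paired with every row of this course.
    menu = prefixed if menu is None else menu.merge(prefixed, how='cross')

print(menu.shape)  # (12, 8): 2 * 1 * 2 * 3 lunches, 4 courses * 2 columns
```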

Solution 2:[2]

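You could loop over everything with nested for loops. It needs no imports, but it will be inefficient for large menus, since the number of combinations is the product of the list lengths.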

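# placeholder names standing in for the rows of each DataFrame
soup_list = ['s1', 's2']
main_list = ['m1']
side_list = ['ss1', 'ss2']
dessert_list = ['d1', 'd2', 'd3']

combination_list = []

# one nested loop per course; the innermost body runs once per combination
for soup in soup_list:
    for main in main_list:
        for side in side_list:
            for dessert in dessert_list:
                # store each lunch as a tuple, as the question asks
                combination_list.append((soup, main, side, dessert))

print(combination_list)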
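
The same nested-loop idea can be applied to the DataFrames directly, which gives tuples of pandas Series with nothing imported besides pandas. Treat this as a sketch: the function name lunch_combinations and the small sample frames are invented for the example.

```python
import pandas as pd


def lunch_combinations(soups, mains, sides, desserts):
    """Return every (soup, main, side, dessert) tuple of row Series."""
    return [
        (soups.iloc[i], mains.iloc[j], sides.iloc[k], desserts.iloc[m])
        for i in range(len(soups))
        for j in range(len(mains))
        for k in range(len(sides))
        for m in range(len(desserts))
    ]


# Hypothetical stand-in frames, much smaller than the ones in the question.
soups = pd.DataFrame({'meal': ['bean soup', 'tomato soup'], 'price': [0.7, 0.7]})
mains = pd.DataFrame({'meal': ['fried chicken'], 'price': [2.6]})
sides = pd.DataFrame({'meal': ['pasta', 'fries'], 'price': [0.7, 1.4]})
desserts = pd.DataFrame({'meal': ['apple', 'banana'], 'price': [0.2, 0.25]})

lunches = lunch_combinations(soups, mains, sides, desserts)
print(len(lunches))  # 2 * 1 * 2 * 2 = 8
```

The function is fixed to four courses. If the number of DataFrames can vary, itertools.product handles that more cleanly.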

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Ritwick Jha