How to generate all possible column combinations in 4 different pandas DataFrames (without any library)
I have four dataframes
1 - list of all soups (all other numbers are just detailed information about the dish)
meal category calories protein fat carbs amount price
0 bean soup soup 41 2.23 2.75 2.23 350 0.7
1 tomato soup soup 45 0.68 1.53 7.70 350 0.7
2 - list of all main dishes
meal category calories ... carbs amount price
0 baked chicken thighs main dish 129 ... 1.86 100 2.6
1 fried chicken main dish 369 ... 28.70 180 2.6
2 fried cauliflower main dish 256 ... 24.10 170 2.8
3 - list of all side dishes
meal category calories protein fat carbs amount price
0 pasta sidedish 135 3.50 2.50 22.80 225 0.7
1 american potatoes sidedish 143 2.55 4.09 24.02 220 1.3
2 fries sidedish 143 2.55 4.09 24.02 200 1.4
4 - list of all desserts
meal category calories protein fat carbs amount price
0 tangerine dessert 39 0.72 0.30 7.7 85 0.25
1 apple dessert 49 0.37 0.40 9.9 130 0.20
2 banana dessert 90 1.20 0.24 19.8 120 0.25
The number of dishes in each dataframe may vary.
I need to build every possible lunch option (each combination consists of one soup, one main dish, one side dish and one dessert).
The output should be a list of tuples, each holding 4 pandas Series:
Combination_list = [(SoupS,MainS,SideS,DessertS),(SoupS1,MainS1,SideS1,DessertS1)...]
Where (for example)
SoupS =
0 bean soup
1 soup
2 41
3 2.23
4 2.75
5 2.23
6 350
7 0.7
How can I write my own function for this without using other libraries (only pandas)?
Solution 1:[1]
Here is a way to get a list of DataFrames, each holding a complete meal, using itertools.product from the standard library (it ships with Python, so it is not an external package):
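from itertools import product
import pandas as pd

# say the four original DataFrames are a, b, c, and d
# each DataFrame becomes a list of row dicts; product() picks one row from each
out = list(map(
    pd.DataFrame,
    product(*map(lambda df: df.to_dict('records'), [a, b, c, d]))
))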
Now, out contains 54 distinct meal combinations (2 soups × 3 main dishes × 3 side dishes × 3 desserts). For example:
>>> out[0]
meal category calories protein fat carbs amount price
0 bean soup soup 41 2.23 2.75 2.23 350 0.70
1 baked chicken thighs main dish 129 NaN NaN 1.86 100 2.60
2 pasta sidedish 135 3.50 2.50 22.80 225 0.70
3 tangerine dessert 39 0.72 0.30 7.70 85 0.25
>>> out[20]
meal category calories protein fat carbs amount price
0 bean soup soup 41 2.23 2.75 2.23 350 0.70
1 fried cauliflower main dish 256 NaN NaN 24.10 170 2.80
2 pasta sidedish 135 3.50 2.50 22.80 225 0.70
3 banana dessert 90 1.20 0.24 19.80 120 0.25
Note: the NaN values appear because I copied your sample data, which is incomplete (I can't make up what's in the '...' columns).
To reproduce the above:
# reproducible setup
import pandas as pd
a = pd.DataFrame(
[['bean soup', 'soup', 41, 2.23, 2.75, 2.23, 350, 0.7],
['tomato soup', 'soup', 45, 0.68, 1.53, 7.7, 350, 0.7]],
columns=['meal', 'category', 'calories', 'protein', 'fat', 'carbs', 'amount', 'price'])
b = pd.DataFrame(
[['baked chicken thighs', 'main dish', 129, 1.86, 100, 2.6],
['fried chicken', 'main dish', 369, 28.7, 180, 2.6],
['fried cauliflower', 'main dish', 256, 24.1, 170, 2.8]],
columns=['meal', 'category', 'calories', 'carbs', 'amount', 'price'])
c = pd.DataFrame(
[['pasta', 'sidedish', 135, 3.5, 2.5, 22.8, 225, 0.7],
['american potatoes', 'sidedish', 143, 2.55, 4.09, 24.02, 220, 1.3],
['fries', 'sidedish', 143, 2.55, 4.09, 24.02, 200, 1.4]],
columns=['meal', 'category', 'calories', 'protein', 'fat', 'carbs', 'amount', 'price'])
d = pd.DataFrame(
[['tangerine', 'dessert', 39, 0.72, 0.3, 7.7, 85, 0.25],
['apple', 'dessert', 49, 0.37, 0.4, 9.9, 130, 0.2],
['banana', 'dessert', 90, 1.2, 0.24, 19.8, 120, 0.25]],
columns=['meal', 'category', 'calories', 'protein', 'fat', 'carbs', 'amount', 'price'])
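The question asks for a list of tuples of Series rather than a list of DataFrames. A minimal sketch of that variant, assuming itertools is acceptable, could look like the following. The tiny two-column menus and the helper name rows_as_series are made up for illustration and are not part of the original answer.

```python
from itertools import product

import pandas as pd

# Hypothetical miniature menus (two columns only, to keep the sketch short).
soups = pd.DataFrame({"meal": ["bean soup", "tomato soup"], "price": [0.7, 0.7]})
mains = pd.DataFrame({"meal": ["fried chicken", "fried cauliflower"], "price": [2.6, 2.8]})
sides = pd.DataFrame({"meal": ["pasta", "fries"], "price": [0.7, 1.4]})
desserts = pd.DataFrame({"meal": ["apple", "banana"], "price": [0.20, 0.25]})


def rows_as_series(df):
    """Return the rows of df as a list of Series (one Series per dish)."""
    return [row for _, row in df.iterrows()]


# product() picks one Series from each menu, so every element is a
# (soup, main, side, dessert) tuple of four Series.
combination_list = list(
    product(*(rows_as_series(df) for df in (soups, mains, sides, desserts)))
)

print(len(combination_list))
print([dish["meal"] for dish in combination_list[0]])
```

Each Series here is indexed by column name, so a field is read as dish["meal"]; if the positional 0..7 index from the question is required, resetting the index of each Series would be an extra step.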
Solution 2:[2]
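You could simply loop over everything with nested for loops. This will be inefficient for large menus, since the number of combinations is the product of the list lengths.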
soup_list = ['s1','s2']
main_list = ['m1']
side_list = ['ss1','ss2']
dessert_list = ['d1','d2','d3']
combination_list = []
for soup in soup_list:
    for main in main_list:
        for side in side_list:
            for dessert in dessert_list:
                combination_list.append([soup,main,side,dessert])
print(combination_list)
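The same looping idea can be applied to the DataFrames themselves with pandas only, as the question requests. The following is a rough sketch, not the answerer's code: the function name lunch_combinations and the small sample menus are hypothetical, and it handles any number of DataFrames instead of hard-coding four nested loops.

```python
import pandas as pd


def lunch_combinations(*frames):
    """Build every tuple that takes exactly one row (as a Series) from each frame."""
    combos = [()]  # start with a single empty combination
    for df in frames:
        # extend every partial combination with each row of the next frame
        combos = [
            combo + (df.iloc[i],)
            for combo in combos
            for i in range(len(df))
        ]
    return combos


# Hypothetical miniature menus, sized like the lists in the loop example.
soups = pd.DataFrame({"meal": ["s1", "s2"], "price": [0.7, 0.7]})
mains = pd.DataFrame({"meal": ["m1"], "price": [2.6]})
sides = pd.DataFrame({"meal": ["ss1", "ss2"], "price": [0.7, 1.3]})
desserts = pd.DataFrame({"meal": ["d1", "d2", "d3"], "price": [0.25, 0.20, 0.25]})

combos = lunch_combinations(soups, mains, sides, desserts)
print(len(combos))
print([dish["meal"] for dish in combos[0]])
```

The list grows by a factor of len(df) at every step, so memory use matches the total number of combinations; for large menus a generator would be the gentler choice.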
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Ritwick Jha |
