Pandas: get all possible value combinations of length k, grouped by feature
I have a pandas DataFrame that looks something like this:
| Feature A | Feature B | Feature C |
|---|---|---|
| A1 | B1 | C1 |
| A2 | B2 | C2 |
Given k as input, I want every combination of values taken from k different features, with each value labelled by its feature. For example, for k = 2 I want:
[{A:A1, B:B1},
{A:A1, B:B2},
{A:A1, C:C1},
{A:A1, C:C2},
{A:A2, B:B1},
{A:A2, B:B2},
{A:A2, C:C1},
{A:A2, C:C2},
{B:B1, C:C1},
{B:B1, C:C2},
{B:B2, C:C1},
{B:B2, C:C2}]
How can I achieve that?
Solution 1:[1]
This is probably not very efficient, but it works at small scale.
First, determine the unique combinations of k columns.
from itertools import combinations
import pandas as pd
k = 2
# every way to choose k of the DataFrame's columns
cols = list(combinations(df.columns, k))
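To make this step concrete, here is a minimal sketch of what `combinations` yields for three columns. The plain list `column_names` is a stand-in for `df.columns`, assumed from the table in the question.

```python
from itertools import combinations

# Stand-in for df.columns, assumed from the question's table.
column_names = ["Feature A", "Feature B", "Feature C"]

k = 2
# Each tuple is one group of k distinct columns; order within a tuple
# follows the order of column_names.
col_groups = list(combinations(column_names, k))
print(col_groups)
# [('Feature A', 'Feature B'), ('Feature A', 'Feature C'), ('Feature B', 'Feature C')]
```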
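Then use MultiIndex.from_product to get the Cartesian product of the values in each group of k columns.
result = []
for c in cols:
    # Cartesian product of the values in the chosen columns, as a list of tuples
    result += pd.MultiIndex.from_product([df[x] for x in c]).values.tolist()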
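Note that the loop above collects plain tuples of values, while the question asks for dicts keyed by feature. One possible variant, sketched here with `itertools.product` instead of `MultiIndex.from_product`, keeps the column names alongside the values. The sample DataFrame and the helper name `value_combinations` are assumptions for illustration, not part of the original answer.

```python
from itertools import combinations, product

import pandas as pd

# Sample frame mirroring the question's table (assumed data).
df = pd.DataFrame(
    {
        "Feature A": ["A1", "A2"],
        "Feature B": ["B1", "B2"],
        "Feature C": ["C1", "C2"],
    }
)


def value_combinations(frame, k):
    """Return one dict per combination of values drawn from k distinct columns."""
    result = []
    for cols in combinations(frame.columns, k):
        # Unique values per column, in order of first appearance.
        pools = [frame[col].unique().tolist() for col in cols]
        for values in product(*pools):
            # Pair each value with the column it came from.
            result.append(dict(zip(cols, values)))
    return result


combos = value_combinations(df, 2)
print(len(combos))   # 12
print(combos[0])     # {'Feature A': 'A1', 'Feature B': 'B1'}
```

Taking `unique()` per column avoids repeated dicts when a column contains duplicate values. Drop it if duplicates should be kept.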
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Emma |
