Convert a Python dictionary with set values to a binary DataFrame
I have a dictionary where the values are sets:
my_dict = {1: {'a', 'b'}, 2: {'a', 'c'}, 3: {'b', 'c', 'd'}, 4: {'a'}}
I would like to convert it to a binary DataFrame whose rows are the dictionary keys and whose columns are the elements of the sets. For the example above, the expected output is:
   a  b  c  d
1  1  1  0  0
2  1  0  1  0
3  0  1  1  1
4  1  0  0  0
How can I do it in an efficient and scalable manner?
Solution 1:[1]
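You can use pandas' Series.str.get_dummies: join each set into a single '|'-separated string, then let get_dummies split it back into one 0/1 column per element. This assumes the set elements are strings that do not themselves contain '|':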
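import pandas as pd
my_dict = {1: {'a', 'b'}, 2: {'a', 'c'}, 3: {'b', 'c', 'd'}, 4: {'a'}}
# Join each set into a '|'-separated string, then expand into 0/1 indicator columns
df = pd.Series({k: list(v) for k, v in my_dict.items()}).str.join('|').str.get_dummies()
print(df)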
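If the set elements might contain '|' or are not strings, the string round trip above no longer applies. As a hedged alternative (not part of the original answer), the same table can be sketched with Series.explode and pd.crosstab, which never builds an intermediate string. The names pairs and binary_df are illustrative only.

```python
import pandas as pd

my_dict = {1: {'a', 'b'}, 2: {'a', 'c'}, 3: {'b', 'c', 'd'}, 4: {'a'}}

# Turn each set into a sorted list, then explode to one row per (key, element) pair.
pairs = pd.Series({k: sorted(v) for k, v in my_dict.items()}).explode()

# Cross-tabulate keys against elements. Each pair occurs at most once,
# so every cell of the resulting table is 0 or 1.
binary_df = pd.crosstab(pairs.index.to_numpy(), pairs.to_numpy())
print(binary_df)
```

One caveat with this sketch: a key whose set is empty explodes to a missing value and is dropped by crosstab, so such rows would have to be added back, for example with reindex.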
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Adam.Er8 |
