'Exploding multiple dict columns and concatenating with original Pandas data frame
I export a Postgres SQL query to create a Pandas data frame df
similar to the following:
df = pd.DataFrame({
'employee_id' : [123, 456, 789],
'country_code' : ['US', 'CAN', 'MEX'],
'sales' : [{'foo': 2, 'bar': 0, 'baz': 1},
{'foo': 3, 'bar': 1, 'baz': 2},
{'foo': 7, 'bar': 0, 'baz': 4}],
'expenses' : [{'red': 1, 'white': 0, 'blue': 3},
{'red': 1, 'white': 0, 'blue': 1},
{'red': 2, 'white': 2, 'blue': 2}]
})
df
employee_id country_code sales expenses
0 123 US {'foo': 2, 'bar': 0, 'baz': 1} {'red': 1, 'white': 0, 'blue': 3}
1 456 CAN {'foo': 3, 'bar': 1, 'baz': 2} {'red': 1, 'white': 0, 'blue': 1}
2 789 MEX {'foo': 7, 'bar': 0, 'baz': 4} {'red': 2, 'white': 2, 'blue': 2}
I would like to be able to explode both the sales
and expenses
columns so that their keys are separate columns. Currently, I'm only able to explode one of these columns, as follows:
df = pd.json_normalize(df['sales'])
df
foo bar baz
0 2 0 1
1 3 1 2
2 7 0 4
I'm not able to pass a list of columns to pd.json.normalize()
.
Questions:
- How do I explode both the
sales
andexpenses
columns? - After exploding both columns, how do I add back the other two columns (
employee_id
andcountry_code
) from the original data frame?
The desired output is:
employee_id country_code foo bar baz red white blue
0 123 US 2 0 1 1 0 3
1 456 CAN 3 1 2 1 0 1
2 789 MEX 7 0 4 2 2 2
Thank you!
Solution 1:[1]
You may use concat
along axis=1 with json_normalize
:
json_cols = ['sales','expenses']
result = pd.concat([pd.json_normalize(df[col]) for col in json_cols],axis=1)
result = pd.concat([df.drop(json_cols,axis=1),result],axis=1)
Output:
result
employee_id country_code foo bar baz red white blue
0 123 US 2 0 1 1 0 3
1 456 CAN 3 1 2 1 0 1
2 789 MEX 7 0 4 2 2 2
Solution 2:[2]
If you want to modify the original dataframe, a variant would be:
cols = ['sales', 'expenses']
df = pd.concat([df]+[pd.json_normalize(df.pop(c)) for c in cols], axis=1)
output:
employee_id country_code foo bar baz red white blue
0 123 US 2 0 1 1 0 3
1 456 CAN 3 1 2 1 0 1
2 789 MEX 7 0 4 2 2 2
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | SomeDude |
Solution 2 | mozway |