'Pandas Pivot Table with multilevel index
I have a df with items and their yearly sales. I would like to change it to pivot table but with the two levels index.
My DF:
date brand_id brand_name art_id art_name count_art
2015 1 cat 10 A 120
2016 1 cat 10 A 100
2017 1 cat 12 B 80
2015 2 dog 20 C 100
2016 2 dog 25 D 110
2015 3 bird 30 E 50
2017 3 bird 31 F 90
I want the result to be like this:
2015 2016 2017
brand_id brand_name art_id art_name count_art art_id art_name count_art art_id art_name count_art
1 cat 10 A 120 10 A 100 12 B 80
2 dog 20 C 100 25 D 110 null null null
3 bird 30 E 50 null null null 31 F 90
For now I have tried following command:
transformed_data = df.pivot_table(values=['art_id', 'art_name', 'count_art'], index=['brand_id', 'brand_name'], columns='date', aggfunc='first')
However it isn't working as expected. I know how to change rows to yearly columns however I don't know how to change multiple columns in multiple rows to one row with more columns.
Solution 1:[1]
IIUC, use pivot_table command to include the values in the desired order. Then, use swaplevel to reorder your levels, and sort_index with sort_remaining=False to ensure only the dates are sorted:
new_cols = ['art_id', 'art_name', 'count_art']
transformed_data = (
df.pivot_table(values=new_cols,
index=['brand_id', 'brand_name'],
columns=['date'], aggfunc='first')
[new_cols]
.swaplevel(axis=1)
.sort_index(level=0, axis=1, sort_remaining=False)
)
output:
date 2015 2016 2017
art_id art_name count_art art_id art_name count_art art_id art_name count_art
brand_id brand_name
1 cat 10.0 A 120.0 10.0 A 100.0 12.0 B 80.0
2 dog 20.0 C 100.0 25.0 D 110.0 NaN NaN NaN
3 bird 30.0 E 50.0 NaN NaN NaN 31.0 F 90.0
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
