'Python Pandas: Rename columns after pivoting
I have a pandas dataframe that is pivoted. I do not know how to change column names so that I can continue working with the pivoted dataframe in a natural way. Below is a minimal working example.
df = pd.DataFrame({'foo': ['one', 'one', 'one', 'two', 'two',
'two'],
'bar': ['A', 'B', 'C', 'A', 'B', 'C'],
'baz': [1, 2, 3, 4, 5, 6],
'zoo': ['x', 'y', 'z', 'q', 'w', 't']})
After pivoting with
df.pivot(index='foo', columns='bar', values=['baz', 'zoo'])
the output is:
baz zoo
bar A B C A B C
foo
one 1 2 3 x y z
two 4 5 6 q w t
What would be the following step to do in order to obtain below output?
A_baz B_baz C_baz A_zoo B_zoo C_zoo
one 1 2 3 x y z
two 4 5 6 q w t
Thanks a lot!
Solution 1:[1]
Use f-strings with list comprehension:
#python 3.6+
df.columns = [f'{j}_{i}' for i, j in df.columns]
#lower python versions
#df.columns = ['{}_{}'.format(j, i) for i, j in df.columns]
print (df)
A_baz B_baz C_baz A_zoo B_zoo C_zoo
foo
one 1 2 3 x y z
two 4 5 6 q w t
Or DataFrame.swaplevel with map and join:
df = df.pivot(index='foo', columns='bar', values=['baz', 'zoo']).swaplevel(0,1,axis=1)
df.columns = df.columns.map('_'.join)
print (df)
A_baz B_baz C_baz A_zoo B_zoo C_zoo
foo
one 1 2 3 x y z
two 4 5 6 q w t
Solution 2:[2]
One option is with pivot_wider from pyjanitor, using the names_glue parameter to reshape the column names:
# pip install pyjanitor
import pandas as pd
import janitor
df.pivot_wider(
index = 'foo',
names_from = 'bar',
values_from = ['baz', 'zoo'],
names_glue = "{bar}_{_value}")
foo A_baz B_baz C_baz A_zoo B_zoo C_zoo
0 one 1 2 3 x y z
1 two 4 5 6 q w t
in the names_glue string template, _value serves as a placeholder for values from values_from. Anything in the {} brackets should either be from names_from or _value to represent values_from.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | jezrael |
| Solution 2 |
