'How to melt groups of columns along with adding columns specifying the column name where the values come from?
I want to be able to convert a wide form dataframe to long form, but while doing that I also want to add columns that specify the column names where the values actually come from. Here is what I mean -
Original dataframe
pd.DataFrame({'A_1_H': list('abc'), 'B_1_H': list('123'), 'C_1_H': list('mno'), 'X_1_L': [9,8,7], 'Y_1_L':list('pqr'), 'Z_1_L':[1,2,3]})
A_1_H B_1_H C_1_H X_1_L Y_1_L Z_1_L
0 a 1 m 9 p 1
1 b 2 n 8 q 2
2 c 3 o 7 r 3
What I want -
Col1 Key1 Col2 Key2
0 a A_1_H 9 X_1_L
1 b A_1_H 8 X_1_L
2 c A_1_H 7 X_1_L
3 1 B_1_H p Y_1_L
4 2 B_1_H q Y_1_L
5 3 B_1_H r Y_1_L
6 m C_1_H 1 Z_1_L
7 n C_1_H 2 Z_1_L
8 o C_1_H 3 Z_1_L
I am grouping the _1_H
suffixed columns into one column Col1
and the corresponding column names in Key1
and similarly with the suffix _1_L
into Col2
with its keys in Key2
.
Is there a way to do it effectively?
Solution 1:[1]
you can melt
twice and concat
both, use filter
to select the columns _H and _L separately.
df_ = pd.concat([pd.melt(df.filter(regex='_H'), var_name='Key1', value_name='Col1'),
pd.melt(df.filter(regex='_L'), var_name='Key2', value_name='Col2')],
axis=1)
print (df_)
Key1 Col1 Key2 Col2
0 A_1_H a X_1_L 9
1 A_1_H b X_1_L 8
2 A_1_H c X_1_L 7
3 B_1_H 1 Y_1_L p
4 B_1_H 2 Y_1_L q
5 B_1_H 3 Y_1_L r
6 C_1_H m Z_1_L 1
7 C_1_H n Z_1_L 2
8 C_1_H o Z_1_L 3
Solution 2:[2]
One option is with pivot_longer from pyjanitor, by passing a list of regular expressions:
import pandas as pd
import janitor
( df
.pivot_longer(
index = None,
names_to = ('Key1', 'Key2'),
values_to = ('Col1', 'Col2'),
names_pattern = ['A|B|C', 'X|Y|Z'])
)
Key1 Col1 Key2 Col2
0 A_1_H a X_1_L 9
1 A_1_H b X_1_L 8
2 A_1_H c X_1_L 7
3 B_1_H 1 Y_1_L p
4 B_1_H 2 Y_1_L q
5 B_1_H 3 Y_1_L r
6 C_1_H m Z_1_L 1
7 C_1_H n Z_1_L 2
8 C_1_H o Z_1_L 3
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Ben.T |
Solution 2 |