'Changing python data-frame to pivot as per need
I have a dataframe like given below:
ID1 ID2 YrMonth Class
1 p1 Feb-19 PE5
1 p1 Feb-19 PE5
1 p1 Feb-19 PE5
1 p1 Feb-19 SC
1 p2 Feb-19 SC
1 p2 Feb-19 SC
1 p2 Feb-19 SC
1 p3 Feb-19 EA
1 p3 Feb-19 EA
1 p3 Feb-19 PE5
1 p4 Feb-19 EA
1 p4 Feb-19 PE5
1 p4 Feb-19 SC
I want to convert it into another dataframe or pivot such that in a given month for a particular ID2 if there is a transition in class it should be reflected in a row as given in output table. For ex - In ID2 for p1 class changes from PE5 to SC. In output I have represented as PE5->SC but it could other convenient representation also.
If there is not change in class for a particular ID2, class should come as it is as in second row of output table class is SC only.
For ID2 p3 there is transition in class from EA to PE5 so it is represented as EA->PE5.
For ID2 p4 there is transition in class from EA-PE5-SC so it is represented as EA->PE5->SC
Output pivot/dataframe
ID1 ID2 YrMonth Class
1 p1 Feb-19 PE5->SC
1 p2 Feb-19 SC
1 p3 Feb-19 EA->PE5
1 p4 Feb-19 EA->PE5->SC
Solution 1:[1]
Use DataFrame.drop_duplicates with aggregate join:
df1 = (df.drop_duplicates()
.groupby(['ID1','ID2','YrMonth'])['Class']
.agg('->'.join).reset_index())
print (df1)
ID1 ID2 YrMonth Class
0 1 p1 Feb-19 PE5->SC
1 1 p2 Feb-19 SC
2 1 p3 Feb-19 EA->PE5
3 1 p4 Feb-19 EA->PE5->SC
If need specify columns for remove duplicates:
df1 = (df.drop_duplicates(['ID1','ID2','YrMonth','Class'])
.groupby(['ID1','ID2','YrMonth'])['Class']
.agg('->'.join).reset_index())
print (df1)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | jezrael |
