'How to pivot a DataFrame creating new columns, considering the max item repeated
I have the next pd.DataFrame:
Index ID Name Date Days
1 1 Josh 5-1-20 10
2 1 Josh 9-1-20 10
3 1 Josh 19-1-20 6
4 2 Mike 1-1-20 10
5 3 George 1-4-20 10
6 4 Rose 1-2-20 10
7 4 Rose 11-5-20 5
8 5 Mark 1-9-20 10
9 6 Joe 1-4-21 10
10 7 Jill 1-1-21 10
I'm needing to make a DataFrame where the ID is not repeated, for that, I want to creat new columns (Date y Days), considering the case with most repeatitions (3 in this case).
The desired output is the next DataFrame:
Index ID Name Date 1 Date 2 Date 3 Days1 Days2 Days3
1 1 Josh 5-1-20 9-1-20 19-1-20 10 10 6
2 2 Mike 1-1-20 10
3 3 George 1-4-20 10
4 4 Rose 1-2-20 11-5-20 10 5
5 5 Mark 1-9-20 10
6 6 Joe 1-4-21 10
7 7 Jill 1-1-21 10
Solution 1:[1]
Here is a solution using pivot with a helper column:
df2 = (df
.assign(col=df.groupby('ID').cumcount().add(1).astype(str))
.pivot(index=['ID','Name'], columns='col', values=['Date', 'Days'])
.fillna('')
)
df2.columns = df2.columns.map('_'.join)
df2.reset_index()
Output:
ID Name Date_1 Date_2 Date_3 Days_1 Days_2 Days_3
0 1 Josh 5-1-20 9-1-20 19-1-20 10 10 6
1 2 Mike 1-1-20 10
2 3 George 1-4-20 10
3 4 Rose 1-2-20 11-5-20 10 5
4 5 Mark 1-9-20 10
5 6 Joe 1-4-21 10
6 7 Jill 1-1-21 10
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | mozway |
