'Pandas Long to Wide conversion
I am new to Pandas. I have a data set with in this format.
UserID ISBN BookRatings
0 276725.0 034545104U 0.0
1 276726.0 155061224 5.0
2 276727.0 446520802 0.0
3 276729.0 052165615W 3.0
4 276729.0 521795028 6.0
I would like to create this
ISBN 276725 276726 276727 276729
UserID
0 034545104U
1 0 155061224 0 0 0
2 0 0 446520802 0 0
3 0 0 0 052165615W 0
4 0 0 0 521795028 0
I tried pivot but was not successful. Any kind advice please?
Solution 1:[1]
I think that pivot() is the right approach here. The most difficult part is to get the arguments correctly. I think we need to keep the original index and the new columns should be the values in column UserID. Also, we want to fill the new dataframe with the values from column ISBN.
For this, I firstly extract the original index as column and then apply the pivot() function:
df = df.reset_index()
result = df.pivot(index='index', columns='UserID', values='ISBN')
# Make your float columns to integers (only works if all user ids are numbers, drop nan values first)
result.columns = map(int,result.columns)
Output:
276725 276726 276727 276729
index
0 034545104U NaN NaN NaN
1 NaN 155061224 NaN NaN
2 NaN NaN 446520802 NaN
3 NaN NaN NaN 052165615W
4 NaN NaN NaN 521795028
Edit: If you want the same appearance as in the original dataframe you have to apply the following line as well:
result = result.rename_axis(None, axis=0)
Output:
276725 276726 276727 276729
0 034545104U NaN NaN NaN
1 NaN 155061224 NaN NaN
2 NaN NaN 446520802 NaN
3 NaN NaN NaN 052165615W
4 NaN NaN NaN 521795028
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
