'How to group by the date and form single row from rest of the values? [duplicate]
I have a problem. I have a table like this with roughly 500k rows and 190 unique dates:
| Date | Return |
|---|---|
| 1 | 0.1 |
| 2 | 0.2 |
| 1 | 0.3 |
| 2 | 0.5 |
| 1 | 0.4 |
| 2 | 0.3 |
And I have to format it so that its grouped by date as an index and all returns are on the one row like this:
| Date | Return | Return 2 | Return 3 |
|---|---|---|---|
| 1 | 0.1 | 0.3 | 0.4 |
| 2 | 0.2 | 0.5 | 0.3 |
or the same but dates as columns
and preferably as numpy array. How to do that? ATM my Jupyter just crashes as my solution is too heavy.
Solution 1:[1]
Use pivot_table:
out = df.assign(col='Return ' + df.groupby('Date').cumcount().add(1).astype(str)) \
.pivot_table('Return', 'Date', 'col').reset_index().rename_axis(None, axis=1)
print(out)
# Output
Date Return 1 Return 2 Return 3
0 1 0.1 0.3 0.4
1 2 0.2 0.5 0.3
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Corralien |
