'Group periodic data in pandas dataframe
I have a pandas dataframe that looks like this:
| idx | A | B |
|---|---|---|
| 01/01/01 00:00:01 | 5 | 2 |
| 01/01/01 00:00:02 | 4 | 5 |
| 01/01/01 00:00:03 | 5 | 4 |
| 02/01/01 00:00:01 | 3 | 8 |
| 02/01/01 00:00:02 | 7 | 4 |
| 02/01/01 00:00:03 | 1 | 3 |
I would like to group data based on its periodicity such that the final dataframe is:
| new_idx | 01/01/01 | 02/01/01 | old_column |
|---|---|---|---|
| 00:00:01 | 5 | 3 | A |
| 00:00:02 | 4 | 7 | A |
| 00:00:03 | 5 | 1 | A |
| 00:00:01 | 2 | 8 | B |
| 00:00:02 | 5 | 4 | B |
| 00:00:03 | 4 | 3 | B |
Is there a way to this that holds when the first dataframe gets big (more columns, more periods and more samples)?
Solution 1:[1]
One way is to melt the DataFrame, then split the datetime to dates and times; finally pivot the resulting DataFrame for the final output:
df = df.melt('idx', var_name='old_column')
df[['date','new_idx']] = df['idx'].str.split(expand=True)
out = df.pivot(['new_idx','old_column'], 'date', 'value').reset_index().rename_axis(columns=[None]).sort_values(by='old_column')
Output
new_idx old_column 01/01/01 02/01/01
0 00:00:01 A 5 3
2 00:00:02 A 4 7
4 00:00:03 A 5 1
1 00:00:01 B 2 8
3 00:00:02 B 5 4
5 00:00:03 B 4 3
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
