Pandas - Reshape Dataframe
I have the following dataframe:
Time AreaIn AreaOut Output
0 1 Area E Area G 200
1 16 Area E Area G 200
2 31 Area E Area G 200
3 46 Area E Area G 300
4 61 Area E Area G 459
5 ... ... ... ...
93 1396 Area E Area G 600
94 1411 Area E Area G 400
95 1426 Area E Area G 500
96 1441 Area E Area G 500
97 1 Area H Area F 600
98 16 Area H Area F 600
99 31 Area H Area F 600
100 46 Area H Area F 600
101 61 Area H Area F 116
102 ... ... ... ...
189 1381 Area H Area F 111
190 1396 Area H Area F 600
191 1411 Area H Area F 600
192 1426 Area H Area F 400
193 1441 Area H Area F 400
I want to reshape it. The 'Time' column currently runs from 1 to 1441 in steps of 15, but I want it to run from 1 to 1441 in steps of 60. Each 'Output' value should be the average of 4 consecutive rows (the sum of the 4 rows divided by 4).
In this case the dataframe contains only two time series, so the result should look like this:
Time AreaIn AreaOut Output
0 1 Area E Area G 450
1 61 Area E Area G 500
2 121 Area E Area G 600
3 181 Area E Area G 892
4 241 Area E Area G 459
5 ... ... ... ...
21 1261 Area E Area G 810
22 1321 Area E Area G 598
23 1381 Area E Area G 650
24 1441 Area E Area G 250
25 1 Area H Area F 600
26 61 Area H Area F 987
27 121 Area H Area F 0
28 181 Area H Area F 211
29 241 Area H Area F 116
30 ... ... ... ...
44 1201 Area H Area F 111
45 1261 Area H Area F 332
46 1321 Area H Area F 551
47 1381 Area H Area F 726
49 1441 Area H Area F 250
However, I want a generic solution that also works with more than two time series.
When I use df = df.groupby(pd.cut(df["Time"], np.arange(1, 1442, 60))).mean() I get the following result:
Time Output
Time
(1, 61] 38.5 2351
(61, 121] 98.5 2752
(121, 181] 158.5 4323
(181, 241] 218.5 2523
(241, 301] 278.5 3456
(301, 361] 338.5 1653
(361, 421] 398.5 4361
(421, 481] 458.5 6543
(481, 541] 518.5 3245
(541, 601] 578.5 2434
(601, 661] 638.5 1387
(661, 721] 698.5 4456
(721, 781] 758.5 2534
(781, 841] 818.5 3424
(841, 901] 878.5 2376
(901, 961] 938.5 2656
(961, 1021] 998.5 3456
(1021, 1081] 1058.5 1212
(1081, 1141] 1118.5 3355
(1141, 1201] 1178.5 2466
(1201, 1261] 1238.5 3462
(1261, 1321] 1298.5 2344
(1321, 1381] 1358.5 2453
(1381, 1441] 1418.5 3256
This groups the outputs of the two time series together and mixes up Area E - Area G with Area H - Area F. I would like to keep the two time series separate.
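One possible way to keep the series apart is to include the area columns in the grouping keys and compute the 60-step bin label directly from 'Time'. The following is only a minimal sketch, not a definitive answer: the sample data is hypothetical (Output is simply derived from Time so the averages are easy to check), and the bin formula assumes that 'Time' always starts at 1 and advances in steps of 15.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two series sampled at Time = 1, 16, ..., 1441.
times = np.arange(1, 1442, 15)
df = pd.concat(
    [
        pd.DataFrame({"Time": times, "AreaIn": "Area E", "AreaOut": "Area G",
                      "Output": times.astype(float)}),
        pd.DataFrame({"Time": times, "AreaIn": "Area H", "AreaOut": "Area F",
                      "Output": 2.0 * times}),
    ],
    ignore_index=True,
)

# Label each row with the start of its 60-wide bin:
# 1, 16, 31, 46 -> 1; 61, 76, 91, 106 -> 61; ...; 1441 -> 1441.
binned = df.assign(Time=(df["Time"] - 1) // 60 * 60 + 1)

# Group by the series identifiers AND the bin, so series are never mixed.
# sort=False keeps the groups in their order of first appearance.
result = (
    binned.groupby(["AreaIn", "AreaOut", "Time"], sort=False, as_index=False)["Output"]
    .mean()
)

# Restore the original column order.
result = result[["Time", "AreaIn", "AreaOut", "Output"]]
print(result.head())
```

Because the grouping keys are the columns themselves, this should carry over to any number of (AreaIn, AreaOut) pairs. Note that the last bin (Time 1441) contains a single row per series, so its mean is just that row's value. As a side observation, pd.cut with its default right-closed intervals produces bins such as (1, 61], which leave out Time == 1 and place 61 in the first bin; that is why the 'Time' means in the output above are 38.5, 98.5, and so on.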
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
