Pandas dataframe.resample multiple columns: max on one column, select corresponding values on another, and mean on others
I have a dataframe with several variables:
tagdata.head()
Out[128]:
                     Depth  Temperature  ...        Ay        Az
Time                                     ...
2017-09-25 21:46:05   23.0         7.70  ...  0.054688 -0.691406
2017-09-25 21:46:10   24.5         6.15  ...  0.148438 -0.742188
2017-09-25 21:46:15   27.5         4.10  ... -0.078125 -0.875000
2017-09-25 21:46:20   29.0         2.55  ...  0.144531 -0.664062
2017-09-25 21:46:25   30.0         2.45  ...  0.343750 -0.886719
[5 rows x 6 columns]
I want to resample every 24H and get, for each 24H window: 1) the maximum Depth, 2) the Temperature value recorded at that maximum depth, and 3) the mean of the last two columns, Ay and Az.
So far I have used the code below, and it works, but I would like to combine the last two lines into one cleaner line if possible.
Thanks!
tagdata_dailydepthmax = tagdata.resample('24H').apply(lambda day: day.loc[day.Depth.idxmax()])
tagdata_dailydepthmax.Ay = tagdata['Ay'].resample('24H').mean()
tagdata_dailydepthmax.Az = tagdata['Az'].resample('24H').mean()
Solution 1:[1]
You can select both columns with a list before resampling, which computes the mean for several columns at once, and assign the result back in a single line:
tagdata_dailydepthmax[['Ay','Az']] = tagdata[['Ay','Az']].resample('24H').mean()
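For reference, the whole task can also be sketched end to end on a small made-up frame. This is only an illustration under some assumptions: the sample values are invented, the names `idx_max`, `at_max`, `means` and `result` are hypothetical, and it uses `groupby` with `pd.Grouper(freq=...)` in place of `resample`, which should produce the same 24-hour bins for a `DatetimeIndex`. The lowercase `"24h"` alias is used because recent pandas versions deprecate the uppercase `"H"`.

```python
import pandas as pd

# Invented sample data: two readings per day over two days.
index = pd.date_range("2017-09-25", periods=4, freq="12h", name="Time")
tagdata = pd.DataFrame(
    {
        "Depth": [23.0, 30.0, 29.0, 27.5],
        "Temperature": [7.70, 2.45, 2.55, 4.10],
        "Ay": [0.0, 1.0, 2.0, 4.0],
        "Az": [-1.0, -3.0, -2.0, -4.0],
    },
    index=index,
)

# Group rows into 24-hour bins (equivalent in spirit to resample('24H')).
daily = tagdata.groupby(pd.Grouper(freq="24h"))

# Timestamp of the deepest reading in each bin.
idx_max = daily["Depth"].idxmax()

# Depth and Temperature taken from those rows, re-indexed by bin start.
at_max = tagdata.loc[idx_max.to_numpy(), ["Depth", "Temperature"]]
at_max.index = idx_max.index

# Per-bin means of the remaining columns, computed in one call.
means = daily[["Ay", "Az"]].mean()

# One row per day: max Depth, its Temperature, and the mean Ay and Az.
result = at_max.join(means)
print(result)
```

This sketch assumes every 24-hour bin contains at least one row. If some days are empty, `idxmax` has no valid label for them, so those bins would need to be dropped or handled separately first.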
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Manjunath K Mayya |
