'Pandas time series time between followed events
i have a dataframe as the follow:
Date User
2-23-2022 09:00:00 u1
2-23-2022 09:01:00 u2
2-24-2022 07:00:00 u1
2-24-2022 09:00:00 u1
2-24-2022 12:00:00 u2
And i want to calculate, for every column, the time elapsed between the same user's events:
Date User output
2-23-2022 09:00:00 u1 NaN
2-23-2022 09:00:01 u2 NaN
2-24-2022 07:00:00 u1 22 hours (time object)
2-24-2022 09:00:00 u1 2 hours (time object)
2-24-2022 12:00:00 u2 27 hours (time object)
The output doesn't have to be in hours, can be in seconds (etc)
Solution 1:[1]
Try with groupby
:
To keep the output in Timestamp
format:
df["output"] = df.groupby("User")["Date"].diff()
>>> df
Date User output
0 2022-02-23 09:00:00 u1 NaT
1 2022-02-23 09:01:00 u2 NaT
2 2022-02-24 07:00:00 u1 0 days 22:00:00
3 2022-02-24 09:00:00 u1 0 days 02:00:00
4 2022-02-24 12:00:00 u2 1 days 02:59:00
To convert to hours (int
)
df["output"] = df.groupby("User")["Date"].diff().dt.total_seconds().div(3600)
>>> df
Date User output
0 2022-02-23 09:00:00 u1 NaN
1 2022-02-23 09:01:00 u2 NaN
2 2022-02-24 07:00:00 u1 22.000000
3 2022-02-24 09:00:00 u1 2.000000
4 2022-02-24 12:00:00 u2 26.983333
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | not_speshal |