'Pandas time series time between followed events

i have a dataframe as the follow:

Date                  User  
2-23-2022 09:00:00    u1
2-23-2022 09:01:00    u2
2-24-2022 07:00:00    u1
2-24-2022 09:00:00    u1
2-24-2022 12:00:00    u2

And i want to calculate, for every column, the time elapsed between the same user's events:

Date                  User      output
2-23-2022 09:00:00    u1        NaN
2-23-2022 09:00:01    u2        NaN
2-24-2022 07:00:00    u1        22 hours (time object)
2-24-2022 09:00:00    u1        2 hours (time object)
2-24-2022 12:00:00    u2        27 hours (time object)

The output doesn't have to be in hours, can be in seconds (etc)



Solution 1:[1]

Try with groupby:

To keep the output in Timestamp format:

df["output"] = df.groupby("User")["Date"].diff()

>>> df
                 Date User          output
0 2022-02-23 09:00:00   u1             NaT
1 2022-02-23 09:01:00   u2             NaT
2 2022-02-24 07:00:00   u1 0 days 22:00:00
3 2022-02-24 09:00:00   u1 0 days 02:00:00
4 2022-02-24 12:00:00   u2 1 days 02:59:00

To convert to hours (int)

df["output"] = df.groupby("User")["Date"].diff().dt.total_seconds().div(3600)

>>> df
                 Date User     output
0 2022-02-23 09:00:00   u1        NaN
1 2022-02-23 09:01:00   u2        NaN
2 2022-02-24 07:00:00   u1  22.000000
3 2022-02-24 09:00:00   u1   2.000000
4 2022-02-24 12:00:00   u2  26.983333

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 not_speshal