'How to calculate total minutes from timestamp with multiple conditions

assume that there is a trip data of a company vehicles.

trips = {'driver':['Tom', 'Max', 'Tom', 'Tom', 'Tom','Tom', 'Tom', 'Jack',  'Tom', 'Tom'],
    'date' : ['2022-01-01 08:02:35',
              '2022-01-01 08:02:35',
              '2022-01-01 08:12:35',
              '2022-01-01 08:17:35',
              '2022-01-01 08:32:35',
             '2022-01-01 08:39:35',
             '2022-01-01 08:59:35',
             '2022-01-01 09:19:22',
            '2022-01-01 09:20:22', 
             '2022-01-01 09:28:22']
     ,
    'speed':[0, 2, 19, 15, 0,  23, 0, 0, 10,0 ],
    'distance_to_company': [0.0, 6.2, 24.4,  28.4, 14.2, 13.7, 0.0, 5.2, 4.2, 0.0 ]}

df = pd.DataFrame(trips)

I'm trying to calculate duration of trips in minutes.

pseudo code of conditions

trip starts at the timestamp:

if (last 'distance_to_company'  < 3.0 AND speed == 0) AND  (next 'distance_to_company' > last 'distance_to_company' AND speed > 0) 

trip ends at the timestamp:

if (last 'distance_to_company'  > 3.0 AND speed > 0) AND  (next 'distance_to_company' < last 'distance_to_company' AND speed == 0)

and calculate total duration of the trips.

output may like:

----------------------------------------------------------------------------------------
driver   |     trip_start          |     trip_end             | trip_duration_(minutes)
  Tom    |  2022-01-01 08:12:35    |  2022-01-01 08:59:35     |         47
  Tom    |  2022-01-01 09:20:22    |  2022-01-01 09:28:22     |          8


Solution 1:[1]

You can use:

# get null speed/distance
m1 = df['speed'].eq(0)
m2 = df['distance_to_company'].eq(0)

# shift the added above masks per driver
# and make groups
g = (m1&m2).groupby(df['driver']).apply(lambda s: s.shift().cumsum())

# ensure datetime
df['date'] = pd.to_datetime(df['date'])

# group by driver + trip and compute start/end/duration
(df
 .groupby(['driver', g])
 .agg(**{'trip_start': ('date', 'min'),
         'trip_end': ('date', 'max'),
         'trip_duration_(minutes)': ('date', lambda x: (x.max()-x.min()).total_seconds()/60)
        })
 .drop level(1) # remove trip group (optional)
 )

Output:

                trip_start            trip_end  trip_duration_(minutes)
driver                                                                 
Tom    2022-01-01 08:12:35 2022-01-01 08:59:35                     47.0
Tom    2022-01-01 09:20:22 2022-01-01 09:28:22                      8.0

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 mozway