Checking and filtering the first event of each group in pandas
I have an event dataset. I grouped it by user_id and location with the pandas library, then sorted each group by created_at. Within each group I want to drop every row that comes before the first 'Open' event, and drop a group entirely if it contains no 'Open' event at all.
My dataset:
| user_id | location | event_name | created_at |
|---|---|---|---|
| A1 | AAA | Event5 | 9/1/2020 9:06:52 PM |
| A1 | AAA | Event6 | 9/1/2020 9:07:02 PM |
| A1 | AAA | Open | 9/1/2020 9:07:12 PM |
| A1 | AAA | Event1 | 9/1/2020 9:07:22 PM |
| A1 | AAA | Event2 | 9/1/2020 9:07:32 PM |
| A1 | AAA | Event3 | 9/1/2020 9:07:42 PM |
| A1 | AAA | Event2 | 9/1/2020 9:07:52 PM |
| A1 | AAA | Open | 9/1/2020 9:08:02 PM |
| A1 | AAA | Event1 | 9/1/2020 9:08:12 PM |
| A1 | AAA | Event2 | 9/1/2020 9:08:22 PM |
| A2 | AAA | Event9 | 9/1/2020 9:07:22 PM |
I want:
| user_id | location | event_name | created_at |
|---|---|---|---|
| A1 | AAA | Open | 9/1/2020 9:07:12 PM |
| A1 | AAA | Event1 | 9/1/2020 9:07:22 PM |
| A1 | AAA | Event2 | 9/1/2020 9:07:32 PM |
| A1 | AAA | Event3 | 9/1/2020 9:07:42 PM |
| A1 | AAA | Open | 9/1/2020 9:08:02 PM |
| A1 | AAA | Event1 | 9/1/2020 9:08:12 PM |
| A1 | AAA | Event2 | 9/1/2020 9:08:22 PM |
Thank you.
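For anyone who wants to reproduce the example, here is a minimal sketch that rebuilds a slice of the sample data and applies the grouping/sorting described in the question. The variable name `df` and the timestamp format are assumptions taken from the tables above.

```python
import pandas as pd

# Rebuild a small slice of the sample data from the question.
df = pd.DataFrame(
    {
        "user_id": ["A1", "A1", "A1", "A1", "A1", "A2"],
        "location": ["AAA", "AAA", "AAA", "AAA", "AAA", "AAA"],
        "event_name": ["Event5", "Event6", "Open", "Event1", "Event2", "Event9"],
        "created_at": [
            "9/1/2020 9:06:52 PM",
            "9/1/2020 9:07:02 PM",
            "9/1/2020 9:07:12 PM",
            "9/1/2020 9:07:22 PM",
            "9/1/2020 9:07:32 PM",
            "9/1/2020 9:07:22 PM",
        ],
    }
)

# Parse the timestamps and order each (user_id, location) group chronologically,
# as described in the question.
df["created_at"] = pd.to_datetime(df["created_at"], format="%m/%d/%Y %I:%M:%S %p")
df = df.sort_values(["user_id", "location", "created_at"]).reset_index(drop=True)
```

From here, `df` is in the sorted per-group order that the solution below assumes.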
Solution 1:[1]
In short:
df[~(df['event_name'] != 'Open').groupby([df['user_id'], df['location']]).cumprod().astype(bool)]
Explanation:
(df['event_name'] != 'Open') gives a series of True/False values, which behave like 1/0 when treated as numbers. cumprod takes the cumulative product of that series within each (user_id, location) group, so the running product stays 1 until the first False (i.e. 0, an 'Open' row) appears, and then stays 0 for the rest of the group.
Zero therefore marks exactly the rows we want to keep: the first 'Open' and everything after it. Groups that never contain an 'Open' (such as user A2) stay at 1 throughout and are dropped. The trailing .astype(bool) ensures the mask is boolean before it is inverted with ~, and the whole approach assumes the rows are already sorted by created_at within each group, as described in the question.
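Put together, here is a runnable sketch of the same idea with the steps named and explicit casts added so the intermediate mask has a well-defined integer/boolean dtype regardless of pandas version. The intermediate variable names are just for illustration.

```python
# 1/0 series: 1 while the row is NOT an 'Open' event, 0 on 'Open' rows.
not_open = (df["event_name"] != "Open").astype(int)

# Cumulative product within each (user_id, location) group:
# stays 1 until the first 'Open' row turns it to 0, then stays 0.
before_first_open = (
    not_open.groupby([df["user_id"], df["location"]])
    .cumprod()
    .astype(bool)
)

# Keep the first 'Open' of each group and everything after it;
# groups with no 'Open' at all are dropped entirely.
result = df[~before_first_open]
print(result)
```

With the slice built above, this keeps the 'Open', 'Event1' and 'Event2' rows for user A1 and drops user A2's lone 'Event9' row, matching the first rows of the desired output.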
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Stack Overflow |
