Pandas create counter column for group but reset count based on multiple conditions

I have the following DataFrame:

Worker  dt_diff          same_employer  same_role
1754    0 days 00:00:00  False          False
2951    0 days 00:00:00  False          False
2951    1 days 00:00:00  True           True
2951    1 days 01:00:00  True           True
3368    0 days 00:00:00  False          False
3368    7 days 00:00:00  True           True
3368    7 days 00:00:00  True           True
3368    7 days 00:00:00  True           True
3368    7 days 00:00:00  True           True
3368    7 days 00:00:00  True           True
3539    0 days 00:00:00  False          False
3539    1 days 00:00:00  True           True
3539    1 days 00:00:00  True           True
3539    3 days 00:30:00  False          False
3539    1 days 00:00:00  True           True
3539    2 days 06:00:00  False          True

I would like to create a new column containing a continuity counter grouped by worker. However, the counter should reset based on the following condition:

if (dt_diff > 6 days) or (same_employer == False) or (same_role == False) then reset the counter

So for the above DataFrame I would expect the result below:

Worker  Counter
1754    1
2951    3
3368    1
3539    3


Solution 1:[1]

I believe the expected counter for worker 3539 should be 1, not 3, because its last row (same_employer is False) resets the count.

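Your reset condition, negated so that True marks a row that continues the count:

# compare the full Timedelta; .dt.days would drop the hours, so 6 days 01:00:00 would not count as more than 6 days
s = ~((df['dt_diff'] > pd.Timedelta(days=6)) | (df['same_employer'] == False) | (df['same_role'] == False))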
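As a small illustration of how "more than 6 days" can be tested, the sketch below compares the whole-day component (`.dt.days`) with a comparison against a full `pd.Timedelta`. The three gap values are hypothetical, chosen only to sit around the 6-day boundary; they are not taken from the question's data.

```python
import pandas as pd

# Hypothetical gaps around the 6-day boundary (not from the question's data).
dt_diff = pd.Series(
    pd.to_timedelta(["6 days 00:00:00", "6 days 01:00:00", "7 days 00:00:00"])
)

# .dt.days keeps only the whole-day component, so 6 days 01:00:00 is not flagged.
by_days = dt_diff.dt.days > 6

# Comparing against a Timedelta uses the full duration, hours included.
by_timedelta = dt_diff > pd.Timedelta(days=6)

print(by_days.tolist())       # [False, False, True]
print(by_timedelta.tolist())  # [False, True, True]
```

On the sample data in the question both tests give the same mask, since no gap falls strictly between 6 and 7 days.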

The key is to count backwards from each worker's last row until the first row that triggers a reset (where s is False). We can build a mask for that with a cumulative product over the reversed rows:

y = s[::-1].groupby(df['Worker']).cumprod()
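To see what the reversed cumulative product does, here is a minimal sketch on toy data. The names `worker`, `keep` and `trailing` and the values are made up for illustration, and the boolean mask is cast to integers explicitly as a precaution.

```python
import pandas as pd

# Toy data (hypothetical): True means "row continues the streak", False means "reset".
worker = pd.Series([1, 1, 1, 2, 2, 2, 2])
keep = pd.Series([False, True, True, False, True, False, True])

# Reverse the rows, then take a running product per worker. Seen from the
# bottom, the product stays 1 until the first reset row and is 0 from there on.
# The grouping key is aligned on the index, so reversing `keep` alone is enough.
trailing = keep[::-1].astype(int).groupby(worker).cumprod()

# Back in the original row order: only the unbroken run at the end of each
# worker's rows is still marked with 1.
print(trailing.sort_index().tolist())  # [0, 1, 1, 0, 0, 0, 1]
```

For worker 2 the True at the second position is zeroed out because a reset row lies between it and the worker's last row.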

Then sum the mask per worker and add 1, so that the row that reset the count is itself counted as the start of the streak:

print(y.groupby(df['Worker']).sum()+1)

Worker
1754    1
2951    3
3368    1
3539    1
dtype: int64
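For anyone who wants to run this end to end, the sketch below rebuilds the question's table and repeats the steps above. It then adds a per-row Counter column, since the question asks for a column. That second part is one possible approach using `cumsum` and `cumcount`, not part of the answer above. The names `reset`, `keep`, `trailing`, `block` and `counter` are chosen here for illustration, and the gap is compared against a full `pd.Timedelta`.

```python
import pandas as pd

# Rebuild the sample table from the question.
df = pd.DataFrame({
    "Worker": [1754] + [2951] * 3 + [3368] * 6 + [3539] * 6,
    "dt_diff": pd.to_timedelta(
        ["0 days"]
        + ["0 days", "1 days", "1 days 01:00:00"]
        + ["0 days"] + ["7 days"] * 5
        + ["0 days", "1 days", "1 days", "3 days 00:30:00", "1 days", "2 days 06:00:00"]
    ),
    "same_employer": [False] + [False, True, True] + [False] + [True] * 5
                     + [False, True, True, False, True, False],
    "same_role": [False] + [False, True, True] + [False] + [True] * 5
                 + [False, True, True, False, True, True],
})

# True where the counter must restart.
reset = (df["dt_diff"] > pd.Timedelta(days=6)) | ~df["same_employer"] | ~df["same_role"]
keep = ~reset

# Part 1: final counter per worker, via the reversed cumulative product.
trailing = keep[::-1].astype(int).groupby(df["Worker"]).cumprod()
counter = trailing.groupby(df["Worker"]).sum() + 1
print(counter)

# Part 2 (alternative sketch): a running counter on every row.
# Each reset row opens a new block within its worker; rows are numbered per block.
block = reset.astype(int).groupby(df["Worker"]).cumsum().rename("block")
df["Counter"] = df.groupby(["Worker", block]).cumcount() + 1
print(df[["Worker", "Counter"]])
```

The last Counter value of each worker should agree with the per-worker result from the first part.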

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Raymond Kwok