'How to detect three consecutive days in Pandas DataFrame?

I have DataFrame with index in Datetime format. I need to keep the days which are adjacent to at least 2 additional days (I mean 3 consecutive days should come together). Please, share your solution.

For example,

Date
2021.11.08 #<-
2021.11.09 #<-
2021.11.10 #<-
2021.11.12
2021.11.13
2021.11.16 #<-
2021.11.17 #<-
2021.11.18 #<-
2021.11.19 #<-
2021.11.22
2021.11.23

<- to be selected



Solution 1:[1]

Try with groupby:

#reset index if "Date" is the index and not a column
#df = df.reset_index()

#convert to datetime
df["Date"] = pd.to_datetime(df["Date"], format="%Y.%m.%d")

#check if adjacent rows are 1 day apart
adjacent = df["Date"].diff().dt.days.fillna(1).eq(1)

#get sequences with a minimum length of 3
mask = df.groupby(adjacent.ne(adjacent.shift()).cumsum())["Date"].transform('count').ge(3)
output = df[mask|mask.shift(-1)]

>>> output
        Date
0 2021-11-08
1 2021-11-09
2 2021-11-10
5 2021-11-16
6 2021-11-17
7 2021-11-18
8 2021-11-19

Solution 2:[2]

With groupby filter

oneday = pd.offsets.Day(1)
diff = df.Date.diff().bfill()

df.groupby(
    diff.ne(oneday).cumsum()
).filter(lambda d: len(d) > 2)


        Date
0 2021-11-08
1 2021-11-09
2 2021-11-10
5 2021-11-16
6 2021-11-17
7 2021-11-18
8 2021-11-19

Solution 3:[3]

Using slicing with a mask:

N=3

# find start of groups
m = ~pd.to_datetime(df['Date']).diff().eq('1d')

# check size and keep if ? N
df[m.groupby(m.cumsum()).transform('size').ge(N)]

Output:

         Date
0  2021.11.08
1  2021.11.09
2  2021.11.10
5  2021.11.16
6  2021.11.17
7  2021.11.18
8  2021.11.19
keep every second element
N = 3
m = ~pd.to_datetime(df['Date']).diff().eq('1d')

g = m.groupby(m.cumsum())

m1 = g.transform('size').ge(N)
m2 = g.cumcount().mod(2) # odd lines

df[m1&m2]

Output:

         Date
1  2021.11.09
6  2021.11.17
8  2021.11.19

NB. If you only want the second and not every second, use eq(1) in place of mod(2)

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 piRSquared
Solution 3