How to detect three consecutive days in a pandas DataFrame?
I have a DataFrame whose index is in datetime format. I need to keep only the days that are adjacent to at least 2 other days, i.e. days that belong to a run of at least 3 consecutive days. Please share your solution.
For example,
Date
2021.11.08 #<-
2021.11.09 #<-
2021.11.10 #<-
2021.11.12
2021.11.13
2021.11.16 #<-
2021.11.17 #<-
2021.11.18 #<-
2021.11.19 #<-
2021.11.22
2021.11.23
Rows marked with #<- are the ones to be selected.
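To try the answers below, the sample from the question can be rebuilt as a small DataFrame. This is only a sketch: the names `dates` and `df` are assumptions, and the dates are stored as strings in a "Date" column because the answers address `df["Date"]` (the question itself keeps them in the index).

```python
import pandas as pd

# Dates copied from the question, still in the original "YYYY.MM.DD" text form
dates = [
    "2021.11.08", "2021.11.09", "2021.11.10",
    "2021.11.12", "2021.11.13",
    "2021.11.16", "2021.11.17", "2021.11.18", "2021.11.19",
    "2021.11.22", "2021.11.23",
]

# One string column named "Date" with a default integer index
df = pd.DataFrame({"Date": dates})
print(df)
```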
Solution 1:[1]
Try with groupby:
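import pandas as pd
#reset index if "Date" is the index and not a column
#df = df.reset_index()
#convert to datetime
df["Date"] = pd.to_datetime(df["Date"], format="%Y.%m.%d")
#True where a row is exactly 1 day after the previous row (first row is False)
adjacent = df["Date"].diff().dt.days.eq(1)
#label runs of equal values in "adjacent"
runs = adjacent.ne(adjacent.shift()).cumsum()
#2 or more adjacent rows in a row means 3 or more consecutive days
mask = adjacent & df.groupby(runs)["Date"].transform("count").ge(2)
#also keep the first day of each run, which sits just before the adjacent rows
output = df[mask | mask.shift(-1, fill_value=False)]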
>>> output
Date
0 2021-11-08
1 2021-11-09
2 2021-11-10
5 2021-11-16
6 2021-11-17
7 2021-11-18
8 2021-11-19
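Since the question keeps the dates in the index, the same run-length idea can also be applied without resetting the index. The following is a sketch under assumptions: the `value` column is a placeholder, the names `starts`, `run_id` and `run_len` are illustrative, and the index is assumed to be sorted with no duplicate days.

```python
import pandas as pd

idx = pd.to_datetime(
    ["2021.11.08", "2021.11.09", "2021.11.10", "2021.11.12", "2021.11.13",
     "2021.11.16", "2021.11.17", "2021.11.18", "2021.11.19",
     "2021.11.22", "2021.11.23"],
    format="%Y.%m.%d",
)
# "value" is a placeholder column so the frame is not empty
df = pd.DataFrame({"value": range(len(idx))}, index=idx)

# Work on the index as a Series so that diff() is available
days = df.index.to_series()
# True on the first row of each run: the gap to the previous row is not 1 day
starts = days.diff().ne(pd.Timedelta(days=1))
# Each run gets its own integer label
run_id = starts.cumsum()
# Length of the run that every row belongs to
run_len = run_id.groupby(run_id).transform("size")

# Keep rows whose run has at least 3 days
output = df[(run_len >= 3).to_numpy()]
print(output)
```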
Solution 2:[2]
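With groupby and filter:
#assumes df["Date"] is already datetime (see pd.to_datetime above)
oneday = pd.Timedelta(days=1)
#gap to the previous row; bfill gives the first row the gap of the second row
diff = df["Date"].diff().bfill()
#a gap other than one day starts a new group; keep groups of 3 or more rows
df.groupby(
    diff.ne(oneday).cumsum()
).filter(lambda d: len(d) > 2)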
Date
0 2021-11-08
1 2021-11-09
2 2021-11-10
5 2021-11-16
6 2021-11-17
7 2021-11-18
8 2021-11-19
Solution 3:[3]
Using slicing with a mask:
N = 3
# find start of groups (rows that are not 1 day after the previous row)
m = ~pd.to_datetime(df['Date']).diff().eq('1d')
# check group size and keep if ≥ N
df[m.groupby(m.cumsum()).transform('size').ge(N)]
Output:
Date
0 2021.11.08
1 2021.11.09
2 2021.11.10
5 2021.11.16
6 2021.11.17
7 2021.11.18
8 2021.11.19
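The mask approach can be wrapped in a small function so that the minimum run length becomes a parameter. This is a sketch, not part of the original answer: `keep_consecutive_runs` is a hypothetical helper, and it assumes the column is sorted, has no duplicate dates, and uses the "YYYY.MM.DD" text format from the question.

```python
import pandas as pd


def keep_consecutive_runs(df, col="Date", n=3):
    """Return the rows whose date belongs to a run of at least n consecutive days.

    Hypothetical helper; assumes df is sorted by `col` with no duplicate dates.
    """
    dates = pd.to_datetime(df[col], format="%Y.%m.%d")
    # True on the first row of each run: the gap to the previous row is not 1 day
    starts = dates.diff().ne(pd.Timedelta(days=1))
    # Size of the run each row belongs to
    run_len = starts.groupby(starts.cumsum()).transform("size")
    return df[run_len.ge(n)]


# A 3-day run in the middle of the data, surrounded by isolated days
df = pd.DataFrame(
    {"Date": ["2021.11.01", "2021.11.03", "2021.11.04", "2021.11.05", "2021.11.08"]}
)
result = keep_consecutive_runs(df)
print(result)
```

Calling `keep_consecutive_runs(df, n=4)` on the same data returns an empty frame, because no run reaches four days.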
To keep only every second element of each selected group:
N = 3
m = ~pd.to_datetime(df['Date']).diff().eq('1d')
g = m.groupby(m.cumsum())
m1 = g.transform('size').ge(N)
m2 = g.cumcount().mod(2) # 1 at odd positions (0-based) within each group
df[m1&m2]
Output:
Date
1 2021.11.09
6 2021.11.17
8 2021.11.19
NB. If you want only the second element of each group rather than every second one, use eq(1) in place of mod(2).
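For illustration, the eq(1) variant from the note can be written out in full on the sample data. This is a sketch: the name `second_only` is an assumption, and `pd.Timedelta(days=1)` is used here in place of the `'1d'` string for the comparison.

```python
import pandas as pd

df = pd.DataFrame({"Date": [
    "2021.11.08", "2021.11.09", "2021.11.10", "2021.11.12", "2021.11.13",
    "2021.11.16", "2021.11.17", "2021.11.18", "2021.11.19",
    "2021.11.22", "2021.11.23",
]})

N = 3
# True on the first row of each run of consecutive days
m = ~pd.to_datetime(df["Date"], format="%Y.%m.%d").diff().eq(pd.Timedelta(days=1))
g = m.groupby(m.cumsum())
# Runs with at least N rows
m1 = g.transform("size").ge(N)
# Position 1 (0-based) is the second row of each run
m2 = g.cumcount().eq(1)

second_only = df[m1 & m2]
print(second_only)
```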
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | piRSquared |
| Solution 3 | |
