Pandas intervals that overlap with the next row
I need to group overlapping intervals. The data is sorted, so I only need to check whether each row overlaps with the row directly after it, not with every other row.
ID start end
1 01-04-2011 01-04-2011
1a 01-04-2011 30-09-2011
2 01-01-2012 31-03-2012
3 01-04-2012 31-10-2012
4 01-11-2012 31-03-2013
6 01-04-2013 31-10-2013
6a 01-10-2013 31-03-2014
7 01-04-2014 31-10-2014
9 01-11-2014 31-03-2015
10 01-04-2015 31-05-2015
11 01-06-2015 31-10-2015
12 01-11-2015 31-03-2016
13 01-10-2016 31-03-2017
14 01-04-2017 30-09-2017
For ID 1, start and end are the same, which means the interval has no end yet. I need to determine whether ID 1 and ID 1a overlap; if not, whether ID 1a and ID 3 overlap, and so on. (The IDs are oversimplified.)
I've searched everywhere and can't seem to solve this.
This is the expected result. A group will always consist of two intervals, and they will always be next to each other, since the CSV is already sorted.
ID start end overlap
1 01-04-2011 01-04-2011 Y_group1
1a 01-04-2011 30-09-2011 Y_group1
ID start end overlap
6 01-04-2013 31-10-2013 Y_group2
6a 01-10-2013 31-03-2014 Y_group2
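One way to get this labelling is to compare each row's end with the next row's start. The following is a minimal sketch, not a definitive solution. It assumes the dates are day-first (dd-mm-yyyy), that endpoints are inclusive (so ID 1, whose start equals its end, still touches ID 1a), and that groups never chain beyond two rows, as stated above. The names `overlaps_next`, `first`, `second`, and `group_no` are made up for illustration.

```python
import io
import pandas as pd

raw = """ID start end
1 01-04-2011 01-04-2011
1a 01-04-2011 30-09-2011
2 01-01-2012 31-03-2012
3 01-04-2012 31-10-2012
4 01-11-2012 31-03-2013
6 01-04-2013 31-10-2013
6a 01-10-2013 31-03-2014
7 01-04-2014 31-10-2014
9 01-11-2014 31-03-2015
10 01-04-2015 31-05-2015
11 01-06-2015 31-10-2015
12 01-11-2015 31-03-2016
13 01-10-2016 31-03-2017
14 01-04-2017 30-09-2017
"""

# Read the whitespace-separated sample; keep ID as text because of values like "1a".
df = pd.read_csv(io.StringIO(raw), sep=r"\s+", dtype={"ID": str})
df["start"] = pd.to_datetime(df["start"], format="%d-%m-%Y")
df["end"] = pd.to_datetime(df["end"], format="%d-%m-%Y")

# Because the rows are sorted by start, a row overlaps the next one
# exactly when the next row starts on or before this row's end.
# The last row compares against NaT, which evaluates to False.
overlaps_next = df["start"].shift(-1) <= df["end"]

first = overlaps_next                               # first row of a pair
second = overlaps_next.shift(1, fill_value=False)   # second row of a pair
mask = first | second

# Running count of pairs; valid only under the "groups of two" assumption.
group_no = first.cumsum()

df["overlap"] = ""
df.loc[mask, "overlap"] = "Y_group" + group_no[mask].astype(str)

print(df[mask])
```

On the sample rows this marks ID 1 and ID 1a as Y_group1 and ID 6 and ID 6a as Y_group2, and leaves the `overlap` column empty for every other row.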
But I got the same error: ValueError: need at least one array to concatenate
I found this solution, but it returns so many True values, maybe because of ID 1?
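import itertools
import pandas as pd

# Assumes df['start'] and df['end'] are already datetime columns.
intervals = df.apply(lambda row: pd.Interval(row['start'], row['end']), axis=1)
# Note: product(..., repeat=2) compares every row with every row, including itself.
overlaps = [
    (i, j, x, y, x.overlaps(y))
    for ((i, x), (j, y))
    in itertools.product(enumerate(intervals), repeat=2)
]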
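A likely reason for the many True values is that `itertools.product(..., repeat=2)` pairs every interval with every other interval, in both orders, and with itself. Below is a hedged sketch of the same `pd.Interval.overlaps` idea restricted to consecutive rows. It uses a hand-typed subset of the sample rows, assumes the dates are already parsed, and uses `closed="both"` (my assumption, not something stated in the question) so that the zero-length interval of ID 1 still counts as touching ID 1a. The names `adjacent` and `overlapping_pairs` are illustrative.

```python
import pandas as pd

# Subset of the sample rows, written as ISO dates for brevity.
df = pd.DataFrame({
    "ID": ["1", "1a", "2", "6", "6a"],
    "start": pd.to_datetime(
        ["2011-04-01", "2011-04-01", "2012-01-01", "2013-04-01", "2013-10-01"]
    ),
    "end": pd.to_datetime(
        ["2011-04-01", "2011-09-30", "2012-03-31", "2013-10-31", "2014-03-31"]
    ),
})

# closed="both" makes both endpoints inclusive.
intervals = [
    pd.Interval(start, end, closed="both")
    for start, end in zip(df["start"], df["end"])
]

# Compare each interval only with the one directly after it.
adjacent = [
    (df["ID"].iloc[i], df["ID"].iloc[i + 1], x.overlaps(y))
    for i, (x, y) in enumerate(zip(intervals, intervals[1:]))
]

# Keep just the ID pairs that overlap.
overlapping_pairs = [(a, b) for a, b, hit in adjacent if hit]
print(overlapping_pairs)
```

This produces one comparison per neighbouring pair (n - 1 in total) instead of the n * n comparisons generated by the product.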
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
