'Postgres Consecutive Days, gaps and islands, Tabibitosan

SQL FIDDLE

I have the following database table:

date name
2014-08-10 bob
2014-08-10 sue
2014-08-11 bob
2014-08-11 mike
2014-08-12 bob
2014-08-12 mike
2014-08-05 bob
2014-08-06 bob
SELECT t.Name,COUNT(*) as frequency
FROM (
    SELECT Name,Date,
            row_number() OVER (
            ORDER BY Date
            ) - row_number() OVER (
            PARTITION BY Name ORDER BY Date
            ) + 1 seq
    FROM orders
    ) t
GROUP BY Name,seq;

Tried running the Tabibitosan method of finding gaps and islands produces the below table which is incorrect. The name "mike" should actually have a count of 2 since the 11th and 12th days are consecutive. How do I fix this?

name frequency
mike 1
bob 3
bob 2
mike 1
sue 1

Correct expected output below:

name frequency
bob 3
bob 2
mike 2
sue 1


Solution 1:[1]

You are using the wrong logic. Basically, you want dates that are sequential, so you want to subtract the sequence from the date:

SELECT t.Name, COUNT(*) as frequency
FROM (SELECT o.*,
             row_number() OVER (PARTITION BY Name ORDER BY Date) as seqnum
      FROM orders o
     ) t
GROUP BY Name, date - seqnum * interval '1 day';

Here is a db<>fiddle.

Solution 2:[2]

Gaps and Islands problem solved in Postgresql:

Run this working demo example:

drop table if exists foobar; 
CREATE TABLE foobar( tick text, date_val date ); 
insert into foobar values('XYZ', '2021-01-03');  --island 1 has width 2
insert into foobar values('XYZ', '2021-01-04');  --island 1
insert into foobar values('XYZ', '2021-05-09');  --island 2 has width 3
insert into foobar values('XYZ', '2021-05-10');  --island 2 
insert into foobar values('XYZ', '2021-05-11');  --island 2
insert into foobar values('XYZ', '2021-07-07');  --island 3 has width 4
insert into foobar values('XYZ', '2021-07-08');  --island 3
insert into foobar values('XYZ', '2021-07-09');  --island 3
insert into foobar values('XYZ', '2021-07-10');  --island 3 
insert into foobar values('XYZ', '2022-10-10');  --island 4 has width 1


select tick, island_width, min_val, max_val, 
       min_val - lag(max_val) over (order by max_val) 
       as gap_width from  
( 
  select tick, count(*) as island_width, 
         min(date_val) min_val, max(date_val) max_val 
  from ( 
    select t.*, 
    row_number() over ( partition by tick order by date_val ) as seqnum 
    from foobar t where tick = 'XYZ' 
    ) t 
  group by tick, date_val - seqnum * interval '1 day' 
) t2 order by max_val desc

Prints:

????????????????????????????????????????????????????????????? 
? tick ? island_width ?  min_val   ?  max_val   ? gap_width ? 
????????????????????????????????????????????????????????????? 
? XYZ  ?            1 ? 2022-10-10 ? 2022-10-10 ?       457 ? 
? XYZ  ?            4 ? 2021-07-07 ? 2021-07-10 ?        57 ? 
? XYZ  ?            3 ? 2021-05-09 ? 2021-05-11 ?       125 ? 
? XYZ  ?            2 ? 2021-01-03 ? 2021-01-04 ?         ยค ? 
????????????????????????????????????????????????????????????? 

The column island_width gives the width of continuous data. The gap_width gives you the width of missing data.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Gordon Linoff
Solution 2 Eric Leschinski