'SQL Query to filter duplicates based on a condition
Consider the following scenario:
| VisitID | Person | DateVisited |
|---|---|---|
| 1 | A | 2022-02-01 |
| 2 | B | 2022-02-03 |
| 3 | C | 2022-02-05 |
| 4 | A | 2022-02-09 |
| 5 | D | 2022-02-10 |
| 6 | E | 2022-02-11 |
| 7 | B | 2022-02-15 |
| 8 | A | 2022-02-18 |
| 9 | E | 2022-02-20 |
| 10 | F | 2022-02-22 |
| 11 | C | 2022-02-25 |
| 12 | G | 2022-02-28 |
I want to get the count of the number of people who visited my store.
Conditions:
After a person visits my store once, I only want to count his visit if he had visited at least after 15 days (15th-day visit counts).
Example : A visited my store on 2022-02-01 , 2022-02-09, 2022-02-18 This should only count as 2 visits as 02-01 and 02-09 were not at least 15 days apart.
For my example above, the count for each person should be:
A -> 2 visits on 02-01 & 02-18
B - 1 visit on 02-03 [02-15 doesn't count]
C -> 2 visit on 02-05 & 02-25
D-> 1 visit on 02-09
E -> 1 visit on 02-11 [02-20 doesn't count]
F -> 1 visit on 02-22
G -> 1 visit on 02-28
So, my SQL query should return a total of 9 (2+1+2+1+1+1+1) visits for February 2022.
Thanks a lot!
Solution 1:[1]
You can find the first visit for each person and then join all subsequent visits on to the original visit results:
select v1.* from (select person p, min(datevisited) m from visits group by person) v
join visits v1 on v.p = v1.person and (v.m = v1.datevisited or v1.datevisited >= v.m + interval '15 days')
order by v1.person
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Ajax1234 |
