'Inner join with special conditions
Given an hourly table A with full records, e.g.:
User Hour Purchase
Joe 1 0
Joe 2 0
Joe 3 0
Joe 4 1
Joe 5 0
Joe 6 0
Joe 7 0
Joe 8 1
Joe 9 1
Joe 10 0
And a subset B of it, e.g.
User Hour Purchase
Joe 3 0
Joe 9 1
Joe 10 0
I want to keep only those records from A that are in B or at most 2hr behind the B subset, without duplication, e.g.
User Hour Purchase
Joe 1 0
Joe 2 0
Joe 3 0
Joe 7 0
Joe 8 0
Joe 9 1
Joe 10 0
How can the result be achieved with an inner join, without duplication (in this case the hours 8&9) and preserving the right purchase values for the hours in B? (This is an MWE, assume multiple users and timestamps instead of hours)
Solution 1:[1]
Consider below simple approach
select * from tableA a
where exists (
select 1 from tableB b
where a.hour between b.hour - 2 and b.hour
and a.user = b.user
)
if applied to sample in your question - output is
I hope in your real case you have datetime or timestamp instead of hour column, so in this case you will need slightly modify where a.hour between b.hour - 2 and b.hour part of above. it will look something like
where a.datetime between datetime_sub(b.datetime, interval 2 hour) and b.datetime
Solution 2:[2]
Try this
with _data as
(
select 'Joe' as _user, 1 as _hour,0 as purchase union all
select 'Joe' as _user, 2 as _hour,0 as purchase union all
select 'Joe' as _user, 3 as _hour,0 as purchase union all
select 'Joe' as _user, 4 as _hour,1 as purchase union all
select 'Joe' as _user, 5 as _hour,0 as purchase union all
select 'Joe' as _user, 6 as _hour,0 as purchase union all
select 'Joe' as _user, 7 as _hour,0 as purchase union all
select 'Joe' as _user, 8 as _hour,1 as purchase union all
select 'Joe' as _user, 9 as _hour,1 as purchase union all
select 'Joe' as _user,10 as _hour,0 as purchase
)
,subset as
(
select 'Joe' as _user, 3 as _hour,0 as purchase union all
select 'Joe' as _user, 9 as _hour,1 as purchase union all
select 'Joe' as _user,10 as _hour,0 as purchase
)
select a._user,a._hour,any_value(b.purchase) from _data a join subset b on
(
a._user = b._user and
(
a._hour = b._hour
or
case when b._hour > a._hour then (case when (b._hour - a._hour) <=2 then 1=1 end) end)
)
group by a._user,a._hour
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Mikhail Berlyant |
| Solution 2 | Mr.Batra |

