Multiple conditions (A and B) and not (C and D). What should the SQL query look like?
Let's say we have an e-shop and all users' events are stored in a single table.
The events table looks like this:
| id | user_id | type | time | comment |
|---|---|---|---|---|
| 1 | 1 | visit_site | 2022-04-07 14:08 | |
| 1 | 1 | add_item_to_cart | 2022-04-07 14:17 | |
| 1 | 1 | add_item_to_cart | 2022-04-07 14:17 | |
| 1 | 1 | checkout | 2022-04-07 14:22 | |
| 1 | 1 | pay | 2022-04-07 14:25 | |
| 1 | 2 | visit_site | 2022-04-07 14:30 | |
| 1 | 2 | add_item_to_cart | 2022-04-07 14:40 | |
| 1 | 2 | add_item_to_cart | 2022-04-07 14:44 | |
| 1 | 2 | checkout | 2022-04-07 14:47 | |
| 1 | 2 | pay | 2022-04-07 14:50 | |
So there are multiple event types; each event has a timestamp (when it happened) and a user who is the actor.
Let's say I want to find all users who added items to the cart and bought on the next day. I would assume the SQL query should be:
SELECT DISTINCT
user_id
FROM
user_event
WHERE
(type = 'add_item_to_cart' AND time BETWEEN '2022-04-07 00:00' AND '2022-04-08 00:00') AND
(type = 'pay' AND time BETWEEN '2022-04-08 00:00' AND '2022-04-09 00:00')
Now, I understand that the above condition is equivalent to:
WHERE
type = 'add_item_to_cart' AND time BETWEEN '2022-04-07 00:00' AND '2022-04-08 00:00' AND
type = 'pay' AND time BETWEEN '2022-04-08 00:00' AND '2022-04-09 00:00'
which will always return an empty result: the WHERE clause is evaluated against one row at a time, and no single row can have two different type values (the two time ranges also overlap only at their shared boundary).
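That such a WHERE clause matches nothing can be checked with a small sketch. It uses Python's built-in sqlite3 module rather than MySQL, purely so that it runs standalone. The two sample rows are hypothetical, and the purchase event is named 'pay' as in the sample table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_event (id INTEGER, user_id INTEGER, type TEXT, time TEXT)"
)
# Hypothetical rows: one user adds to cart on 2022-04-07 and pays on 2022-04-08.
conn.executemany(
    "INSERT INTO user_event VALUES (?, ?, ?, ?)",
    [
        (1, 1, "add_item_to_cart", "2022-04-07 14:17"),
        (2, 1, "pay", "2022-04-08 10:00"),
    ],
)

added = "(type = 'add_item_to_cart' AND time BETWEEN '2022-04-07 00:00' AND '2022-04-08 00:00')"
paid = "(type = 'pay' AND time BETWEEN '2022-04-08 00:00' AND '2022-04-09 00:00')"


def users_where(condition):
    """Return the distinct user ids whose rows satisfy the given WHERE condition."""
    sql = f"SELECT DISTINCT user_id FROM user_event WHERE {condition} ORDER BY user_id"
    return [row[0] for row in conn.execute(sql)]


only_added = users_where(added)                      # rows matching the first half
only_paid = users_where(paid)                        # rows matching the second half
both_on_one_row = users_where(f"{added} AND {paid}")  # both halves on the same row
print(only_added, only_paid, both_on_one_row)
```

Each half on its own finds user 1, but the combined condition finds nobody, because it asks a single row to be both an add-to-cart event and a payment.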
So I have 3 questions:
- How do I rewrite the query so that I get customers who added an item to the cart in one date range and bought in another date range?
- The above condition would actually match purchases made almost 2 days after adding to cart, e.g. '2022-04-07 00:01' added to cart, '2022-04-08 23:58' bought. How do I make a condition that depends on the previous one, i.e. match purchases made strictly less than 1 day after the last date of checkout (with matching user id)?
- Is there a way to force the events to be related to the same user, so that the query doesn't return a user who bought an item less than 1 day after someone else (with a different user_id) added an item to cart?
Solution 1:[1]
Instead of putting both conditions into the WHERE clause of a single pass over the table, you can try a self join: join the table to itself on user_id, so that each row pair holds two events of the same user, and require a difference of one day between them:
SELECT DISTINCT
t1.id,
t1.user_id,
t1.time AS time1,
t1.type AS type1,
t2.time AS time2,
t2.type AS type2
FROM
user_event t1
INNER JOIN
user_event t2
ON
t1.user_id = t2.user_id
AND
DATEDIFF(t2.time, t1.time) = 1 -- paid on the calendar day after adding to cart
WHERE
t1.type = 'add_item_to_cart'
AND
t2.type = 'pay'
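This approach addresses your questions: the join on user_id guarantees that both events belong to the same user, and the DATEDIFF condition ties the payment date to the add-to-cart date. Note that DATEDIFF compares only the date parts, so it counts calendar days rather than 24-hour periods. You can find a fiddle here: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=3066b711f72ce7444859be1a824d9eb3. The input rows of this fiddle contain three kinds of users: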
- user who bought the same day,
- user who bought the day after (targeted user),
- user who bought two days after.
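For the second question (a purchase strictly less than 24 hours after adding to cart), a calendar-day comparison is not enough, and the elapsed time has to be compared instead. Below is a minimal sketch of that variant, not the code from the fiddle. It assumes SQLite through Python's sqlite3 module so that it runs standalone, and the sample rows are hypothetical. In MySQL the same bound could presumably be written as t2.time >= t1.time AND t2.time < t1.time + INTERVAL 1 DAY, which is untested here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_event (id INTEGER, user_id INTEGER, type TEXT, time TEXT)"
)
# Hypothetical sample rows (not taken from the question or the fiddle).
conn.executemany(
    "INSERT INTO user_event VALUES (?, ?, ?, ?)",
    [
        (1, 1, "add_item_to_cart", "2022-04-07 14:17"),
        (2, 1, "pay", "2022-04-08 09:00"),  # 18 h 43 min later: should match
        (3, 2, "add_item_to_cart", "2022-04-07 00:01"),
        (4, 2, "pay", "2022-04-08 23:58"),  # next calendar day, but almost 48 h later
        (5, 3, "add_item_to_cart", "2022-04-07 10:00"),
        (6, 4, "pay", "2022-04-07 11:00"),  # paid by a different user
    ],
)

# strftime('%s', ...) is SQLite's way to get Unix seconds; the self join on
# user_id keeps both events tied to the same user.
query = """
SELECT DISTINCT t1.user_id
FROM user_event t1
INNER JOIN user_event t2
    ON t1.user_id = t2.user_id
WHERE t1.type = 'add_item_to_cart'
  AND t2.type = 'pay'
  AND (CAST(strftime('%s', t2.time) AS INTEGER)
       - CAST(strftime('%s', t1.time) AS INTEGER)) >= 0
  AND (CAST(strftime('%s', t2.time) AS INTEGER)
       - CAST(strftime('%s', t1.time) AS INTEGER)) < 86400
ORDER BY t1.user_id
"""
users_within_24h = [row[0] for row in conn.execute(query)]
print(users_within_24h)
```

With these rows only user 1 is returned: user 2 paid on the next calendar day but almost 48 hours later, and the payment by user 4 is never paired with the cart of user 3 because the join requires matching user_id values.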
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
