Multiple conditions (A and B) and not (C and D). What should the SQL query look like?

Let's say we have an e-shop and all users' events are stored in a single table. The events table (user_event) looks like this:

id user_id type             time             comment
1  1       visit_site       2022-04-07 14:08
1  1       add_item_to_cart 2022-04-07 14:17
1  1       add_item_to_cart 2022-04-07 14:17
1  1       checkout         2022-04-07 14:22
1  1       pay              2022-04-07 14:25
1  2       visit_site       2022-04-07 14:30
1  2       add_item_to_cart 2022-04-07 14:40
1  2       add_item_to_cart 2022-04-07 14:44
1  2       checkout         2022-04-07 14:47
1  2       pay              2022-04-07 14:50

So there are multiple event types, and each event has a timestamp (when it happened) and a user who is the actor.

Let's say I want to find all users who added items to the cart and bought on the next day. I would assume the SQL query should be:

SELECT DISTINCT
  user_id
FROM
  user_event
WHERE
  (type = 'add_item_to_cart' AND time BETWEEN '2022-04-07 00:00' AND '2022-04-08 00:00') AND
  (type = 'buy' AND time BETWEEN '2022-04-08 00:00' AND '2022-04-09 00:00')

Now, I understand that the above condition is basically equal to:

WHERE
  type = 'add_item_to_cart' AND time BETWEEN '2022-04-07 00:00' AND '2022-04-08 00:00' AND
  type = 'buy' AND time BETWEEN '2022-04-08 00:00' AND '2022-04-09 00:00'

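which will always return an empty result, because the WHERE clause is checked against each row separately: a single row cannot have two different values of type, and the two date ranges on the time column share only their boundary instant.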

So I have 3 questions:

  1. How do I rewrite the query so that I get customers who added an item to the cart in one date range and bought in another date range?
  2. The above condition would actually match purchases made almost 2 days after adding to cart, e.g. '2022-04-07 00:01' (added to cart) and '2022-04-08 23:58' (bought). How do I make a condition relative to the previous one, i.e. match buys strictly less than 1 day after the last checkout (with matching user id)?
  3. Is there a way to enforce that the events belong to the same user, so that the query doesn't return a user who bought an item <1 day after someone else (with a different user_id) added an item to cart?

Solution 1:[1]

Instead of putting both conditions in the WHERE clause of a single SELECT, you can self-join the table to pair events made by the same user one day apart:

SELECT DISTINCT
    t1.id,
    t1.user_id,
    t1.time AS time1,
    t1.type AS type1,
    t2.time AS time2,
    t2.type AS type2
FROM 
    user_event t1 
INNER JOIN 
    user_event t2
ON 
    t1.user_id = t2.user_id
AND 
    DATEDIFF(t2.time, t1.time) = 1
WHERE 
    t1.type = 'add_item_to_cart'
AND 
    t2.type = 'pay'

This solution answers all your questions. You can find a fiddle here: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=3066b711f72ce7444859be1a824d9eb3. The input rows of this fiddle contain three kinds of users:

  1. user who bought the same day,
  2. user who bought the day after (targeted user),
  3. user who bought two days after.
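
For question 2, note that DATEDIFF in MySQL compares only the date parts of its arguments, so it measures calendar days rather than elapsed hours. If you need payments made strictly less than 24 hours after a cart event, one possible variation of the self-join above is to compare the timestamps directly. This is only a sketch: it assumes MySQL and that time is a DATETIME (or TIMESTAMP) column, and it returns only the user ids.

```sql
-- Sketch: users who paid strictly less than 24 hours after adding an item to the cart.
-- Assumption: MySQL, with `time` stored as DATETIME or TIMESTAMP.
SELECT DISTINCT
    t1.user_id
FROM
    user_event t1
INNER JOIN
    user_event t2
ON
    t1.user_id = t2.user_id                 -- both events belong to the same user
AND
    t2.time > t1.time                       -- the payment comes after the cart event
AND
    t2.time < t1.time + INTERVAL 1 DAY      -- and less than 24 hours after it
WHERE
    t1.type = 'add_item_to_cart'
AND
    t2.type = 'pay';
```

This matches a payment against any earlier add_item_to_cart event of the same user. If the interval should be measured from the user's latest such event only, the left side would have to be reduced to one row per user first (for example with MAX(time) in a subquery).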

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1