'Partition by weeknumber and next weeknumber

I would like to count for each customer how many times a specific product was purchased in the past. I want to highlight the purchases for the same product (where the second order date is close to the first order date) with a rn = 2, so I can only count the rows with rn = 1

I created the following query and also included the current output. Its containing a partition by week number, to filter out purchases for the same product in the same week. It's working quite good, but the behaviour is not exactly what I was hoping for.

create table  sandbox.hm_orders as 
select o.customer_id
,o.product_id
,o.order_date
,ROW_NUMBER() over (partition by o.customer_id, o.product_id,concat(EXTRACT(year FROM order_date),EXTRACT(week FROM order_date)) order by o.order_date asc) as rn
,concat(EXTRACT(year FROM o.order_date),'_',EXTRACT(week FROM o.order_date)) as weeknr
    from datamarts.orders o
    where o.label_id = 1
    and o.order_date > '2020-01-01'
    and o.payment_status = 'PAID'

Current output:

customer_id product_ID order_date rn weeknr
4708818 128703 2020-05-11 20:19:25 1 2020_20
4708818 128703 2020-05-12 22:13:09 2 2020_20
4708818 128703 2020-06-06 21:45:04 1 2020_23
4708818 274578 2020-07-02 22:02:10 1 2020_27
4753958 137482 2021-03-14 18:13:04 1 2021_10
4753958 137482 2021-03-15 17:29:03 1 2021_11

As you can see in first two rows, the difference between the first the rows is 1 day and it will mark the second row with a rowNumber 2. For the last 2 rows, the difference between the orders is also 1 day. But since the weeknumbers are different, it will not give the second row a rowNumber 2.

Therefore I would like to find a way to also include the next weeknumber for the partition by. In this case, the order that have been done in 2021-11 needs a row number 2, and the week number 10 needs row number 1

desired output

customer_id product_ID order_date rn weeknr
4708818 128703 2020-05-11 20:19:25 1 2020_20
4708818 128703 2020-05-12 22:13:09 2 2020_20
4708818 128703 2020-06-06 21:45:04 1 2020_23
4708818 274578 2020-07-02 22:02:10 1 2020_27
4753958 137482 2021-03-14 18:13:04 1 2021_10
4753958 137482 2021-03-15 17:29:03 2 2021_11


Solution 1:[1]

A bit complicated way is to calculate a ranking by summing over a calculated flag.
Then use the rank in the row_number.

SELECT *
, ROW_NUMBER() over (partition by customer_id, product_id, rnk order by order_date asc) as rn
, TO_CHAR(order_date, 'YYYY_WW') as weeknr
FROM
(
SELECT *
, SUM(flag) over (partition by customer_id, product_id order by order_date asc) as rnk
FROM
(
  SELECT 
    o.customer_id
  , o.product_id
  , o.order_date
  , CASE WHEN 1 >= DATE_PART('day', o.order_date - LAG(o.order_date) over (partition by o.customer_id, o.product_id order by o.order_date asc))
    THEN 0 ELSE 1 END AS flag
  FROM orders o
  WHERE o.label_id = 1
    AND o.order_date > '2020-01-01'
    AND o.payment_status = 'PAID'
) q1
) q2
customer_id product_id order_date flag rnk rn weeknr
4708818 128703 2020-05-11 20:19:25 1 1 1 2020_19
4708818 128703 2020-05-12 22:13:09 0 1 2 2020_19
4708818 128703 2020-06-06 21:45:04 1 2 1 2020_23
4708818 274578 2020-07-02 22:02:10 1 1 1 2020_27
4753958 137482 2021-03-14 18:13:04 1 1 1 2021_11
4753958 137482 2021-03-15 17:29:03 0 1 2 2021_11

Test on db<>fiddle here

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1