'How to evaluate rows and get the max value based on multiple columns

I have a SQL question where I want to evaluate and get the latest work order for a location based on multiple criteria

The table looks something like this

 location   work order         create dt.      status         result  
   1.          123              3/1/22         complete      positive 
   1.          124              3/2/22         incomplete.    null 
   2.          231              2/1/22         cancelled.     null
   2.          232              2/3/22         incomplete.    null 

The requirement is as follows

For each location, find the latest work order based on the following criteria

  1. If there are multiple work orders with results, pick the one with the latest date
  2. If there are multiple work orders but one with result and one with no result, pick the one with the result - even if it is not latest
  3. If there are multiple work orders, but none have result, pick the latest one that is not cancelled
  4. If there are multiple work orders, but all are cancelled, pick the latest one

The result would be something like this

location    work order 
  1.           123
  2.           232

Since for location 1, we pick the earlier one, since it has the result And for location 2, we pick the earlier one, since it is not cancelled.

Thanks



Solution 1:[1]

Here's the source data

SQL> select * from t;

  LOCATION  WORKORDER CREATED    STATUS               RESULT
---------- ---------- ---------- -------------------- --------------------
         1        123 03/01/2022 complete             positive
         1        124 03/02/2022 incomplete
         2        231 02/01/2022 cancelled
         2        232 02/03/2022 incomplete

We can pick up some additional data on a per location basis

SQL>   select
  2      t.*,
  3      max(created) over ( partition by location) as last_date,
  4      count(result) over ( partition by location) result_count,
  5      max(case when result is not null then created end) over ( partition by location) result_date,
  6      max(case when status != 'cancelled' then created end) over ( partition by location) non_cancelled_date
  7    from t
  8  /

  LOCATION  WORKORDER CREATED    STATUS               RESULT               LAST_DATE  RESULT_COUNT RESULT_DAT NON_CANCEL
---------- ---------- ---------- -------------------- -------------------- ---------- ------------ ---------- ----------
         1        123 03/01/2022 complete             positive             03/02/2022            1 03/01/2022 03/02/2022
         1        124 03/02/2022 incomplete                                03/02/2022            1 03/01/2022 03/02/2022
         2        231 02/01/2022 cancelled                                 02/03/2022            0            02/03/2022
         2        232 02/03/2022 incomplete                                02/03/2022            0            02/03/2022

and use that to apply our rules

SQL> select *
  2  from
  3  (
  4    select
  5      t.*,
  6      max(created) over ( partition by location) as last_date,
  7      count(result) over ( partition by location) result_count,
  8      max(case when result is not null then created end) over ( partition by location) result_date,
  9      max(case when status != 'cancelled' then created end) over ( partition by location) non_cancelled_date
 10    from t
 11  )
 12  where ( result_count > 1 and created = result_date )     -- rule1
 13  or    ( result_count = 1 and created = result_date )     -- rule2
 14  or    ( result_count = 0 and non_cancelled_date = created )    -- rule3
 15  or    ( result_count = 0 and non_cancelled_date is null and created = last_date ) -- rule4
 16  /

  LOCATION  WORKORDER CREATED    STATUS               RESULT               LAST_DATE  RESULT_COUNT RESULT_DAT NON_CANCEL
---------- ---------- ---------- -------------------- -------------------- ---------- ------------ ---------- ----------
         1        123 03/01/2022 complete             positive             03/02/2022            1 03/01/2022 03/02/2022
         2        232 02/03/2022 incomplete                                02/03/2022            0            02/03/2022

If you're unfamiliar with these "OVER" functions, here's my tutorial series on them https://www.youtube.com/watch?v=0cjxYMxa1e4&list=PLJMaoEWvHwFIUwMrF4HLnRksF0H8DHGtt

Solution 2:[2]

You can use the ROW_NUMBER analytic function:

SELECT *
FROM   (
  SELECT t.*,
         ROW_NUMBER() OVER (
           PARTITION BY location
           ORDER BY
             CASE
             WHEN result IS NOT NULL   THEN 0
             WHEN status = 'cancelled' THEN 2
                                       ELSE 1
             END ASC,
             create_dt DESC
         ) AS rn
  FROM   table_name t
)
WHERE  rn = 1;

Which, for the sample data:

CREATE TABLE table_name (location, work_order, create_dt, status, result) AS
SELECT 1, 123, DATE '2022-01-03', 'complete',   'positive' FROM DUAL UNION ALL
SELECT 1, 124, DATE '2022-02-03', 'incomplete', null FROM DUAL UNION ALL
SELECT 2, 231, DATE '2022-01-02', 'cancelled',  null FROM DUAL UNION ALL
SELECT 2, 232, DATE '2022-03-02', 'incomplete', null FROM DUAL;

Outputs:

LOCATION WORK_ORDER CREATE_DT STATUS RESULT RN
1 123 03-JAN-22 complete positive 1
2 232 02-MAR-22 incomplete null 1

db<>fiddle here

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Connor McDonald
Solution 2 MT0