'Making a conditional aggregate

I have tricky grouping problem for our business reasons, I have a table which has values like this


----------------------------
| NAME  |  TYPE   |  VALUE  |
----------------------------
|   N1  |   T1    |   V1    |
|   N1  |   T2    |   V2    |
|   N1  |  NULL   |   V3    |
|   N2  |   T2    |   V4    |
|   N2  |  NULL   |   V5    |
|   N3  |  NULL   |   V6    |
-----------------------------

I need to group it in a way that,

  • The first level grouping will be by name.
  • At the second level,
    • When the available types are T1,T2 and NULL, group T1 and NULL together and have T2 grouped seperately.
    • When the available types are T2 and NULL, group NULL with T2.
    • When NULL is the only available type, just have it as it is.

The expected O/P for the above table is,

----------------------------
|   N1  |   T1    | V1+V3   |
|   N1  |   T2    |   V2    |
|   N2  |   T2    | V4+V5   |
|   N3  |  NULL   |   V6    |
-----------------------------

How to achieve this in snowflake sql. Or any other server, so that I can find an equivalent in Snowflake.



Solution 1:[1]

The following query should work:

SELECT t1.NAME, COALESCE(TYPE, MIN_TYPE), SUM(VALUE)
FROM mytable AS t1
JOIN (
   SELECT NAME, MIN(TYPE) AS MIN_TYPE
   FROM mytable
   GROUP BY NAME
) AS t2 ON t1.NAME = t2.NAME    
GROUP BY t1.NAME, COALESCE(TYPE, MIN_TYPE)

The query uses a derived table in order to extract the MIN(TYPE) value per NAME. Using COALESCE we can then convert NULL to either T1 or T2.

Edit:

You can create a pivoted version of the expected result set using the following query:

SELECT NAME, 
       CASE 
          WHEN T1SUM IS NULL THEN 0
          ELSE COALESCE(T1SUM, 0) + COALESCE(NULLSUM,0)
       END AS T1SUM,
       CASE 
          WHEN T1SUM IS NULL AND T2SUM IS NOT NULL 
             THEN COALESCE(T2SUM, 0) + COALESCE(NULLSUM,0)
          ELSE COALESCE(T2SUM, 0)
       END AS T2SUM,
       CASE 
          WHEN T1SUM IS NULL AND T2SUM IS NULL THEN COALESCE(NULLSUM,0)
          ELSE 0
       END AS NULLSUM  
FROM (
   SELECT NAME, 
          SUM(CASE WHEN TYPE = 'T1' THEN VALUE END) AS T1SUM,
          SUM(CASE WHEN TYPE = 'T2' THEN VALUE END) AS T2SUM,
          SUM(CASE WHEN TYPE IS NULL THEN VALUE END) AS NULLSUM
   FROM mytable
   GROUP BY NAME) AS t

Solution 2:[2]

So in Giorgos's answer that totals are given in a pivoted, or single row be case form, not many rows per case, and this can be written simpler:

with this data:

WITH data_table(name, type, value) AS (
    SELECT * FROM VALUES
    (10, 1, 100 ),
    (10, 2, 200 ),
    (10, null, 400 ),
    (11, 2, 100 ),
    (11, null, 200 ),
    (12, null, 100 )
)

and this SQL

SELECT name
    ,SUM(IFF(type=1, value, null)) as t1_val
    ,SUM(IFF(type=2, value, null)) as t2_val
    ,SUM(IFF(type is null, value, null)) as tnull_val
    ,IFF(t1_val is not null, t1_val + zeroifnull(tnull_val), null) as c1_sum
    ,IFF(t1_val is not null, t2_val, t2_val + zeroifnull(tnull_val)) as c2_sum
    ,IFF(t1_val is null AND t2_val is null, tnull_val, null) as c3_sum
FROM data_table
GROUP BY 1;

we get:

NAME T1_VAL T2_VAL TNULL_VAL C1_SUM C2_SUM C3_SUM
10 100 200 400 500 200 null
11 null 100 200 null 300 null
12 null null 100 null null 100

which shows for the 10 row the null sum binds with 1 sum, for the 11 row the null sum binds with the 2 sum, and in the 12 row we get the null sum by itself.

We can unpivot these values if we wish, but joining to a mini table with 3 rows like so:

SELECT d.name,
    p.c2 as type,
    case p.c1
        WHEN 1 then d.c1_sum
        WHEN 2 then d.c2_sum
        ELSE d.c3_sum
    end as value
FROM (
    SELECT name
        ,SUM(IFF(type=1, value, null)) as t1_val
        ,SUM(IFF(type=2, value, null)) as t2_val
        ,SUM(IFF(type is null, value, null)) as tnull_val
        ,IFF(t1_val is not null, t1_val + zeroifnull(tnull_val), null) as c1_sum
        ,IFF(t1_val is not null, t2_val, t2_val + zeroifnull(tnull_val)) as c2_sum
        ,IFF(t1_val is null AND t2_val is null, tnull_val, null) as c3_sum
    FROM data_table
    GROUP BY 1
) AS d
JOIN (
    SELECT column1 as c1, column2 as c2 
    FROM VALUES (1,'T1'),(2,'T2'),(null,'null')
) AS p 
    ON ((d.c1_sum is not null AND p.c1 = 1) 
        OR (d.c2_sum is not null AND p.c1 = 2) 
        OR (d.c3_sum is not null AND p.c1 is null))
ORDER BY 1,2;

which gives the original requested output:

NAME TYPE VALUE
10 T1 500
10 T2 200
11 T2 300
12 null 100

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Simeon Pilgrim