Generate a UUID that is the same for a group of columns in SQL

Can someone please suggest a way to write a SQL query in Amazon Athena that generates a UUID which is the same for every row sharing the same values in a group of columns?

For example, I have a table like this, and I want one UUID per distinct combination of column1, column2 and column3.

column1 | column2 | column3 | column4
2016    | 101     | 1       | 25
2016    | 101     | 1       | 59
2017    | 105     | 2       | 57
2017    | 105     | 2       | 78

Output 1 must look like this:

ID      | column1 | column2 | column3 | column4
UUID-1  | 2016    | 101     | 1       | 25
UUID-1  | 2016    | 101     | 1       | 59
UUID-2  | 2017    | 105     | 2       | 57
UUID-2  | 2017    | 105     | 2       | 78

Output 2 should look like this:

ID      | count | column1 | column2 | column3
UUID-1  | 2     | 2016    | 101     | 1
UUID-2  | 2     | 2017    | 105     | 2

I understand that Output 2 can be produced by grouping Output 1. Can someone suggest how I can generate Output 1?

Thanks.



Solution 1:[1]

You can try the uuid() function. Combined with GROUP BY it returns one UUID per group, which gives Output 2.

SELECT uuid() id,
       COUNT(*) AS "count",
       column1,
       column2,
       column3
FROM T
GROUP BY column1,
         column2,
         column3
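
To try this without a real table, here is a minimal sketch that feeds the sample rows from the question in through an inline VALUES list. The CTE named T and the alias v are just stand-ins I made up for the real table; the rest is the same grouped query.

```sql
-- Hypothetical inline sample data standing in for table T
WITH T AS (
    SELECT *
    FROM (
        VALUES
            (2016, 101, 1, 25),
            (2016, 101, 1, 59),
            (2017, 105, 2, 57),
            (2017, 105, 2, 78)
    ) AS v (column1, column2, column3, column4)
)
SELECT uuid()   AS id,        -- one value per output row, i.e. one per group
       COUNT(*) AS "count",   -- number of source rows in the group
       column1,
       column2,
       column3
FROM T
GROUP BY column1, column2, column3
```

This should return two rows, each with a count of 2. Keep in mind that uuid() is random, so the IDs will differ on every run.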

EDIT

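I see you edited your question. To get Output 1, you can generate one UUID per group in a subquery and join it back to the table. The subquery uses GROUP BY so that uuid() is evaluated once per group; with SELECT DISTINCT it would be evaluated for every row, so every row would get its own UUID and nothing would collapse.

SELECT t1.*, t2.column4
FROM (
    SELECT uuid() id,
           column1,
           column2,
           column3
    FROM T
    GROUP BY column1,
             column2,
             column3
) t1 INNER JOIN T t2
ON t1.column1 = t2.column1
AND t1.column2 = t2.column2
AND t1.column3 = t2.column3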

Another way is to assign a UUID to every row and then use the max window function to keep only one UUID per (column1, column2, column3) group.

select max(id) over (partition by column1,column2,column3) as id, 
       column1,
       column2,
       column3,
       column4
from 
(
    SELECT uuid() id,*
    FROM T
) t1
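
As the question notes, Output 2 is just Output 1 grouped by the ID. A rough sketch of that, wrapping the window-function query in a CTE: the name output1 is hypothetical, and the CAST to varchar is my own addition (an assumption, not part of the query above) so that max and GROUP BY operate on plain strings.

```sql
WITH output1 AS (
    -- Output 1: every row carries the single UUID chosen for its group
    SELECT max(id) OVER (PARTITION BY column1, column2, column3) AS id,
           column1,
           column2,
           column3,
           column4
    FROM (
        -- Assumption: cast the UUID to varchar so it behaves as a plain string
        SELECT CAST(uuid() AS varchar) AS id, *
        FROM T
    ) t1
)
-- Output 2: collapse Output 1 to one row per ID with a row count
SELECT id,
       COUNT(*) AS "count",
       column1,
       column2,
       column3
FROM output1
GROUP BY id, column1, column2, column3
```

Selecting from output1 directly instead of grouping it gives Output 1, so both results can come from the same CTE.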

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1