'How do I identify distinct combinations across array-columns and then unnest in sql presto
I have a database called programs created as
CREATE TABLE programs (
name varchar(200) NOT NULL,
role varchar(200) NOT NULL,
section text[] NOT NULL,
sub_section text[] NOT NULL,
title text[] NOT NULL
);
INSERT INTO programs (name, role, section, sub_section, title) VALUES
('John','Lead','{"VII","VII","VII"}','{"A","A","C"}','{"STUDY","STUDY","STUDY"}'),
('Olga','Member','{"VII","VII"}','{"A","A"}','{"STUDY","STUDY"}'),
('Ben','Co-Lead','{"XI","X"}','{"A","B"}','{"STUDY","TRAVEL"}'),
('Ana','Member','{"VII","II","VI"}','{"A","ALL","B"}','{"STUDY","STUDY","TRAVEL"}');
Here's what the table looks like
| name | role | section | sub_section | title |
| ---- | ------- | ------------ | ----------- | ------------------------ |
| John | Lead | VII,VII,VII | A,A,C | STUDY,STUDY,STUDY |
| Olga | Member | VII,VII | A,A | STUDY,STUDY |
| Ben | Co-Lead | XI,X | A,B | STUDY,TRAVEL |
| Ana | Member | VII,II,VI | A,ALL,B | STUDY,STUDY,TRAVEL |
I want to identify distinct combinations across the section, sub-section, and title columns, as well as unnesting to get this as output
| name | role | section.sub_section | title |
| ---- | ------- | ------------------- | ------------------------ |
| John | Lead | VII.A | STUDY
| John | Lead | VII.C | STUDY
| Olga | Member | VII.A | STUDY
| Ben | Co-Lead | XI.A | STUDY
| Ben | Co-Lead | X.B | TRAVEL
| Ana | Member | VII.A | STUDY
| Ana | Member | II.ALL | STUDY
| Ana | Member | VI.B | TRAVEL
I'm fairly new to SQL and I'm really struggling with getting desired output. Your help would be very much appreciated.
Solution 1:[1]
You desired data does not show "combinations across the section, sub-section, and title columns", it seems that you require to match corresponding array based on positions, so you can just unnest and group by fields which you want to distinct on.
Assuming that corresponding columns contain arrays of varchars (if not - you will need to use some string functions to convert them):
-- sample data
WITH dataset (name, role, section, sub_section, title) AS (
VALUES ('John','Lead',array['VII','VII','VII'],array['A','A','C'],array['STUDY','STUDY','STUDY']),
('Olga','Member',array['VII','VII'],array['A','A'],array['STUDY','STUDY']),
('Ben','Co-Lead',array['XI','X'],array['A','B'],array['STUDY','TRAVEL']),
('Ana','Member',array['VII','II','VI'],array['A','ALL','B'],array['STUDY','STUDY','TRAVEL'])
)
--query
select name,
role,
sec || '.' || sub_sec "section.sub_section",
t title
from dataset
cross join unnest(section, sub_section, title) as t(sec, sub_sec, t)
group by name, role, sec, sub_sec, t
order by name
Output:
| name | role | section.sub_section | title |
|---|---|---|---|
| Ana | Member | VII.A | STUDY |
| Ana | Member | II.ALL | STUDY |
| Ana | Member | VI.B | TRAVEL |
| Ben | Co-Lead | XI.A | STUDY |
| Ben | Co-Lead | X.B | TRAVEL |
| John | Lead | VII.A | STUDY |
| John | Lead | VII.C | STUDY |
| Olga | Member | VII.A | STUDY |
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Guru Stron |
