'Pull columns from series of joins in SQL

I am kind of stuck at one problem at my job where I need to pull 2 cols from base table and 1 column from a series of joins.

Please note that, I can not provide real data so I am using dummy column/table names and there are 100s of columns in real project.

Select A.Name,B.Age, D.Sal 
From A Left join B on A.iD=B.id and B.Date=CURRENT_DATE
(/* join A and B table and return distinct column which is B.XYZ) 
inner join C on C.iD=B.XYZ  
(/* join B and C take C.YYY column for next join */)
inner join D on D.id=C.YYY  
(/* Take out the D.Sal column from this join */) where A.Dept='IT'

I have written this query but it is taking forever to run because B.XYZ column has a lot of duplicates. how can I get distinct of B.XYZ column from that join.



Solution 1:[1]

For Joining Table B, you first get a distinct table of the columns you need from B then join.

SELECT 
  A.Name,
  B.Age, 
  D.Sal 
From A 
LEFT JOIN ( -- Instead of all cols (*), just id, Date, Age and xyz might do
  SELECT DISTINCT * FROM B 
) B ON A.iD = B.id AND B.Date = CURRENT_DATE
    --(/* join A and B table and return distinct column which is B.XYZ */) 
INNER JOIN C ON C.iD = B.XYZ  
    --(/*join B and C take C.YYY column for next join */)
INNER JOIN D ON D.id = C.YYY  
    --(/* Take out the D.Sal column from this join */)
WHERE A.Dept='IT'

Solution 2:[2]

You say you get the same rows multifold, because for a b.id, date and age you get the same xyz more than once, or so I understand it.

One option is to join with a subquery that gets the distinct data:

SELECT a.name, b.age, d.sal 
FROM a 
LEFT JOIN 
(
  SELECT DISTINCT id, date, age, xyz FROM b
) dist_b ON dist_b.id = a.id and dist_b.date = CURRENT_DATE
INNER JOIN c ON c.id = dist_b.xyz  
INNER JOIN d ON d.id = c.yyy  
WHERE a.dept = 'IT';

Of course you can even move the date condition inside the subquery:

SELECT a.name, b.age, d.sal 
FROM a 
LEFT JOIN 
(
  SELECT DISTINCT id, age, xyz FROM b WHERE date = CURRENT_DATE
) dist_b ON dist_b.id = a.id
INNER JOIN c ON c.id = dist_b.xyz  
INNER JOIN d ON d.id = c.yyy  
WHERE a.dept = 'IT';

Your LEFT OUTER JOIN doesn't work by the way. As you are inner joining the following tables, a match must exists, so your outer join becomes an inner join. For the outer join to work you would have to outer join the following tables, too.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 nazim
Solution 2