'Replace multiple LEFT JOINS on the same table
I have a SQL query that consists of multiple left joins to the same table and has a very bad performance. The only difference is that each left join searches for a different feature in the table, which is then set as a new column.
SELECT
t0.*,
m1.value AS "Bestellnummer"
m2.value AS "Bestellposition",
m3.value AS "Aussteller",
m4.value AS "Baugruppe",
m5.value AS "Artikelnummer",
m6.value AS "Zertifikatsnummer",
m7.value AS "Arbeitspaket",
m8.value AS "Chargennummer",
m9.value AS "Systemnummer",
m10.value AS "Armaturnummer"
FROM table_1 t0
LEFT JOIN table_2 m1 ON (m1.dokument_id = t0.id AND m1.merkmal = 'Bestellnummer')
LEFT JOIN table_2 m2 ON (m2.dokument_id = t0.id AND m2.merkmal = 'Bestellposition')
LEFT JOIN table_2 m3 ON (m3.dokument_id = t0.id AND m3.merkmal = 'Aussteller')
LEFT JOIN table_2 m4 ON (m4.dokument_id = t0.id AND m4.merkmal = 'Baugruppe')
LEFT JOIN table_2 m5 ON (m5.dokument_id = t0.id AND m5.merkmal = 'Artikelnummer')
LEFT JOIN table_2 m6 ON (m6.dokument_id = t0.id AND m6.merkmal = 'Zertifikatsnummer')
LEFT JOIN table_2 m7 ON (m7.dokument_id = t0.id AND m7.merkmal = 'Arbeitspaket')
LEFT JOIN table_2 m8 ON (m8.dokument_id = t0.id AND m8.merkmal = 'Chargennummer')
LEFT JOIN table_2 m9 ON (m9.dokument_id = t0.id AND m9.merkmal = 'Systemnummer')
LEFT JOIN table_2 m10 ON (m10.dokument_id = t0.id AND m10.merkmal = 'Armaturnummer')
I have already tried to replace all the left joins with the following statement:
SELECT
t0.id,
t0.document_id,
t0.vault_id,
t0.document_type_id,
t0.state,
t0.name,
to_number(to_char(unixts_to_date(t0.file_create_date), 'YYYYMMDD')) AS file_create_date,
to_number(to_char(unixts_to_date(t0.file_change_date), 'YYYYMMDD')) AS file_change_date,
t1."Bestellnummer",
t1."Aussteller",
t1."Baugruppe",
t1."Artikelnummer",
t1."Zertifikatsnummer",
t1."Arbeitspaket",
t1."Chargennummer",
t1."Systemnummer",
t1."Armaturnummer"
FROM table_1 t0
LEFT JOIN
(
SELECT
dokument_id,
max(case when merkmal = 'Bestellnummer' then value else null end ) as "Bestellnummer",
max(case when merkmal = 'Bestellposition' then value else null end ) as "Bestellposition",
max(case when merkmal = 'Aussteller' then value else null end ) as "Aussteller",
max(case when merkmal = 'Baugruppe' then value else null end ) as "Baugruppe",
max(case when merkmal = 'Artikelnummer' then value else null end ) as "Artikelnummer",
max(case when merkmal = 'Zertifikatsnummer' then value else null end ) as "Zertifikatsnummer",
max(case when merkmal = 'Arbeitspaket' then value else null end ) as "Arbeitspaket",
max(case when merkmal = 'Chargennummer' then value else null end ) as "Chargennummer",
max(case when merkmal = 'Systemnummer' then value else null end ) as "Systemnummer",
max(case when merkmal = 'Armaturnummer' then value else null end ) as "Armaturnummer"
FROM
table_2
GROUP BY dokument_id
) t1 on t1.dokument_id = t0.id
But this does not work, since a document of the table tabe_1 for example can have 2 times the characteristic 'article number' with different values.
Does anyone have an idea?
Thank you!
Solution 1:[1]
Ouch. It looks like someone in your organization has unfortunately "discovered" the Entity-Attribute-Value data model, which promises flexibility but is hardly scalable at all. I don't know if there is an easy trick, and you did not provide useful details like an explain plan output.
Your problem is made worse by the fact that you have no WHERE condition on table_1. So, guessing here, you are probably looking at a HASH_JOIN operation using table_1 as the "build table"[1] and then 9 loops using table_2 as the "probe table". Worse still, it's likely that the build table from table_1 does not fit in PGA memory, making the whole process way less efficient.
First, you should have indexes on table_1(id) and table_2(dokument_id, merkmal, value). Including value in the table_2 index can be very helpful in a situation like this, because it means Oracle never has to read the blocks from table_2 -- everything it needs from that table is already in the index (this is called a "covering index"). That index is likely to be smaller than the table.
Second, if that doesn't help, consider restructuring table_1 as an Index-Organized Table. That, along with the covering index on table_2 might make a sort-merge join viable.
No guarantees here. You have a bad data model here for the query you are trying to execute. Clever SQL can only go so far. The real answer is to fix your data model. Move each the attribute value from table_2 into a new column on table_1.
P.S. If you do have a WHERE clause on table_1 and just omitted it from your question for some reason, you should update your question because it changes the answer a lot.
P.P.S. If you are still stuck, update your question with DBMS_XPLAN output with statistics.
Solution 2:[2]
Have you considered making a temp table and then doing updates to it utilizing the conditions in your left joins on each update statement, only as inner joins?
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Matthew McPeak |
| Solution 2 | DBANoob |
