'LEFT JOIN with an OR in the ON clause BigQuery Standard SQL

I need some help understanding joins in bigquery standard sql. I want to do a left join keeping all the columns in table1, and joining to table2 if 2 fields match OR a different 2 fields match. This should be better explained in my example.

Table1:

id1 id2 column1  
1   a   first    
2   b   second   
3   c   third    
4   d   fourth   
5   e   fifth    
6   f   sixth

Table 2:

id3  id4    column2 
1    5674   alpha    
2    4535   bravo 
345  a      charlie  
341  b      delta

I want to keep all the rows in table1 and all the rows from table2 if id1 = id3 OR id2 = id4, the resulting table would look like this:

Result:

id1 id2 column1 id3 id4     column2  
1   a   first   1   5674    alpha    
1   a   first   345 a       charlie  
2   b   second  2   4535    bravo    
2   b   second  341 b       delta
3   c   third    
4   d   fourth   
5   e   fifth    
6   f   sixth

Although I can't get this result as it seems I can't do a left join with an OR statement in the ON clause.

I have tried this query:

SELECT * FROM table1 
JOIN table2
on (table1.id1 = table2.id3) or (table1.id2 = table2.id4)

Which is an inner join and results in:

id1 id2 column1 id3 id4     column2  
1   a   first   1   5674    alpha    
1   a   first   345 a       charlie  
2   b   second  2   4535    bravo    
2   b   second  341 b       delta

Which is almost there but does not include the other rows from table1

Trying this query:

SELECT * FROM table1 
LEFT JOIN table2
on (table1.id1 = table2.id3) or (table1.id2 = table2.id4)

Results in the error:

LEFT OUTER JOIN cannot be used without a condition that is an equality of fields from both sides of the join.

I understand that there may be a workaround which I would love to hear, but also it would be good to understand why the same method for the inner join is not applicable to the left join.

Thanks in advance, Ryan

EDIT I am still struggling to get my head around why this is not possible, Mikhail's workaround is fine quite heavy on resources and I'd like to understand what is preventing me from using the conditional left join.



Solution 1:[1]

Below is for BigQuery Standard SQL

one of the options:

#standardSQL
SELECT DISTINCT * FROM (
  SELECT * FROM `project.dataset.table1` 
  LEFT JOIN `project.dataset.table2` 
  ON id1 = id3   
  UNION ALL
  SELECT * FROM `project.dataset.table1` 
  LEFT JOIN `project.dataset.table2` 
  ON id2 = id4 
)

Solution 2:[2]

ORACLE SQL:

SELECT * FROM table1 
LEFT JOIN table2
on (table1.id1 = table2.id3) or (table1.id2 = table2.id4)

The above ORACLE SQL can be re-written in BIGQUERY. As both queries will fetch the same results.

BIGQUERY SQL:

#standardSQL
SELECT * FROM (
  SELECT * FROM `project.dataset.table1` 
  LEFT JOIN `project.dataset.table2` 
  ON id1 = id3   
  UNION ALL
  SELECT * FROM `project.dataset.table1` 
  LEFT JOIN `project.dataset.table2` 
  ON id2 = id4 
)

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Thomas
Solution 2