PySpark SQL with inline CASE WHEN join condition

This is the original SQL with an inline CASE WHEN join condition:

select * 
from table_a
LEFT JOIN table_b
ON case when table_a.key not in ('1','2') then '0' else table_a.key end = table_b.key

What is the equivalent pyspark code?

I tried the when().otherwise() function and the if() function, but neither worked.

table_a=spark.sql('''select 1 as key union select 2 as key''')

table_a.show()

table_b=spark.sql('''select 3 as key union select 0 as key''')

table_b.show()

join_condition = when(((table_a.key == '1') | (table_a.key == '2')), table_a.key == table_b.key).otherwise(('0' == table_b.key))

df = table_a.join(table_b, join_condition, 'leftouter').select(table_a['*'], table_b['*'])

df = table_a.join(table_b, ((if(((table_a.key == '1') | (table_a.key == '2')), ('0'), (table_a.key))) == table_b.key), 'leftouter')\
.select(table_a['*'], table_b['*'])

Thank you!



Solution 1:[1]

Whenever you have multiple conditions like this: table_a.key == '1' | table_a.key == '2', you have to wrap each one in parentheses separately, like this: (table_a.key == '1') | (table_a.key == '2'). The parentheses are required because Python's bitwise | operator binds more tightly than ==, so without them the expression is parsed as table_a.key == ('1' | table_a.key) == '2', which raises an error on Spark columns.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 pltc