PySpark: How to chain Column.when() using a dictionary with reduce?

I'm trying to build a chain of when() calls from a dictionary of conditions using reduce(), and then pass the result to DataFrame.withColumn().

For example:

conditions = {
    "0": (col("a") == 1.0) & (col("b") != 1.0),
    "1": (col("c") == 1.0) & (col("d") == 1.0)
}

Using reduce(), I implemented this:

when_stats = reduce(lambda key, value: when(conditions[key], lit(key)), conditions)

and later used it in withColumn():

df2 = df1.withColumn(result, when_stats)

The problem is that it only takes the first condition, "0", and doesn't chain the second one. Printing when_stats gives me:

Column<'CASE WHEN ((a = 1.0) AND (NOT (b = 1.0))) THEN 0 END'>

When I add a third condition, it throws an error:

TypeError: unhashable type: 'Column'
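
Both symptoms are consistent with how functools.reduce calls its function: the first argument is the accumulated result so far, the second is the next element, and iterating a dictionary yields only its keys. The following is a minimal sketch in plain Python, with no Spark involved, that records which arguments reduce passes. The string values and the record helper are stand-ins invented for illustration; they are not part of the original code.

```python
from functools import reduce

# Stand-in dictionary: plain strings replace the Column conditions.
conditions = {"0": "cond_0", "1": "cond_1", "2": "cond_2"}

calls = []

def record(acc, item):
    # Log the (accumulator, next element) pair that reduce passes in.
    calls.append((acc, item))
    # Return a placeholder for whatever the real lambda would return.
    return f"<result of call {len(calls)}>"

reduce(record, conditions)
print(calls)
```

With two entries there is a single call, ("0", "1"), so a lambda that only looks at its first argument builds one when() and never touches "1". With three entries the second call receives the previous result as its first argument. In the lambda above that result is a Column, and conditions[key] then tries to hash it, which would explain the TypeError.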

So the question is: how can I loop through the dictionary and build the complete when().when().when()... chain? Is there a better solution, especially if I want an otherwise() at the end?



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
