PySpark: How to chain Column.when() using a dictionary with reduce?
I'm trying to build a chain of when() calls from a dictionary of conditions using reduce(), and then pass the result to DataFrame.withColumn().
For example:
from functools import reduce
from pyspark.sql.functions import col, lit, when

conditions = {
    "0": (col("a") == 1.0) & (col("b") != 1.0),
    "1": (col("c") == 1.0) & (col("d") == 1.0),
}
Using reduce(), I implemented this:
when_stats = reduce(lambda key, value: when(conditions[key], lit(key)), conditions)
and later used it in withColumn():
df2 = df1.withColumn(result, when_stats)
The problem is that it only picks up the first condition, "0", and doesn't chain the second one. Printing when_stats gives me:
Column<'CASE WHEN ((a = 1.0) AND (NOT (b = 1.0))) THEN 0 END'>
When I add a third condition, it fails with an error:
TypeError: unhashable type: 'Column'
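A plain-Python trace may help explain both symptoms. It is only a sketch under stated assumptions: strings stand in for the Column conditions, and a list stands in for the Column that when() returns (chosen because a list, like a Column, is unhashable). The names `trace`, `seen` and `error` are made up for illustration.

```python
from functools import reduce

# Strings stand in for the real Column conditions (assumption for illustration).
conditions = {"0": "cond0", "1": "cond1", "2": "cond2"}

seen = []  # records the arguments reduce() hands to the lambda

def trace(acc, nxt):
    # reduce() calls f(accumulated_result, next_element). Iterating a dict
    # yields its keys, and with no initial value the first key becomes the
    # first accumulator, so the parameters are not (key, value).
    seen.append((acc, nxt))
    return ["CASE", acc]  # a list is unhashable, like a Column

reduce(trace, conditions)
print(seen[0])  # ('0', '1'): the second key arrives as the ignored argument

# On the next call the accumulator is the previous result, so using it as a
# dictionary key (conditions[key] in the original lambda) cannot work.
try:
    conditions[seen[1][0]]
except TypeError as exc:
    error = str(exc)
print(error)  # unhashable type: 'list'
```

With two entries the lambda runs once, on ("0", "1"), which is why only the "0" branch appears. With three entries the second call receives the earlier result as its first argument, and the dictionary lookup raises the TypeError.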
So the question is: how can I loop through the dictionary and build the complete when().when().when()... chain? Is there a better solution, especially if I want to end with otherwise()?
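One possible approach, sketched here rather than offered as the definitive answer, is to fold over conditions.items() and give reduce() an initial value that already exposes a when() method. The helper `chain_when` and the stand-in class `SqlSketch` are hypothetical names introduced for this example; `SqlSketch` only mimics the when()/otherwise() interface so the sketch runs without Spark. The commented lines at the end show how the same helper would presumably be called with PySpark, where the pyspark.sql.functions module can serve as the seed because functions.when() starts the chain and Column.when() continues it.

```python
from functools import reduce

def chain_when(conditions, seed, default=None):
    """Fold {label: condition} into seed.when(c0, l0).when(c1, l1)...

    `seed` is any object exposing .when(condition, value).
    """
    if not conditions:
        raise ValueError("at least one condition is required")
    chain = reduce(
        lambda acc, item: acc.when(item[1], item[0]),  # item is (label, condition)
        conditions.items(),
        seed,
    )
    return chain if default is None else chain.otherwise(default)

class SqlSketch:
    """Hypothetical stand-in for a Column; records the chain as SQL-like text."""
    def __init__(self, text=""):
        self.text = text
    def when(self, condition, value):
        return SqlSketch(f"{self.text} WHEN {condition} THEN {value}")
    def otherwise(self, value):
        return SqlSketch(f"{self.text} ELSE {value}")
    def __str__(self):
        return f"CASE{self.text} END"

demo = chain_when({"0": "a = 1", "1": "c = 1"}, SqlSketch(), default="none")
print(demo)  # CASE WHEN a = 1 THEN 0 WHEN c = 1 THEN 1 ELSE none END

# With PySpark installed (assumption), the call would look like:
#   import pyspark.sql.functions as F
#   when_stats = chain_when(conditions, F, default="unknown")
#   df2 = df1.withColumn("result", when_stats)
```

Passing the labels directly relies on when() and otherwise() accepting literal values as well as Column expressions; wrapping them in lit() as in the original attempt should work equally well. Branch order follows the dictionary's insertion order, which matters when more than one condition can match a row.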
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
