When condition in PySpark with an equal column
I'm trying to use two similar conditions with the like operator, as shown in the code below, but every row matches the first condition and comes back as 3. No row comes back as 7, even though I have checked the source file and rows that should match the second condition do exist.
Dataframe:
| id | Type | AdditionalData |
|---|---|---|
| 1 | OnSiteMessage | %7B%22pluginType%22:5,%22Device%22:1%7D |
| 2 | OnSiteMessage | %7B%22pluginType%22:5,%22Device%22:2%7D |
| 3 | OnSiteMessage | %7B%22pluginType%22:5,%22Device%22:1%7D |
| 4 | Lightbox | %7B%22pluginType%22:5,%22Device%22:1%7D |
| 5 | None | %7B%22pluginType%22:5,%22Device%22:1%7D |
| 6 | PushOptin | %7B%22pluginType%22:5,%22Device%22:1%7D |
| 7 | OnSiteMessage | %7B%22pluginType%22:5,%22Device%22:1%7D |
| 8 | OnSiteMessage | %7B%22pluginType%22:5,%22Device%22:2%7D |
| 9 | OnSiteMessage | %7B%22pluginType%22:5,%22Device%22:2%7D |
Expected dataframe:
| id | Type | AdditionalData | sk_channel |
|---|---|---|---|
| 1 | OnSiteMessage | %7B%22pluginType%22:5,%22Device%22:1%7D | 3 |
| 2 | OnSiteMessage | %7B%22pluginType%22:5,%22Device%22:2%7D | 7 |
| 3 | OnSiteMessage | %7B%22pluginType%22:5,%22Device%22:1%7D | 3 |
| 4 | Lightbox | %7B%22pluginType%22:5,%22Device%22:1%7D | 4 |
| 5 | None | %7B%22pluginType%22:5,%22Device%22:1%7D | None |
| 6 | PushOptin | %7B%22pluginType%22:5,%22Device%22:1%7D | 8 |
| 7 | OnSiteMessage | %7B%22pluginType%22:5,%22Device%22:1%7D | 3 |
| 8 | OnSiteMessage | %7B%22pluginType%22:5,%22Device%22:2%7D | 7 |
| 9 | OnSiteMessage | %7B%22pluginType%22:5,%22Device%22:2%7D | 7 |
This is the code I'm using:
from pyspark.sql.functions import col, lit, lower, when

# Map each row to a channel key based on Type and AdditionalData; drop unmatched rows.
df_plugin_type = df_plugin.withColumn(
    'sk_channel',
    when(lower(col('Type')).contains('onsitemessage') & lower(col('AdditionalData')).like("%22:1%"), lit(3))
    .when(lower(col('Type')).contains('onsitemessage') & lower(col('AdditionalData')).like("%22:2%"), lit(7))
    .when(lower(col('Type')).contains('lightbox'), lit(4))
    .when(lower(col('Type')).contains('pushoptin'), lit(8))
).filter(col('sk_channel').isNotNull())
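One possible explanation, offered as an assumption because the sample rows above do not reproduce it on their own: in a like pattern every `%` is a wildcard, so `"%22:1%"` matches any value that contains `22:1` anywhere, not only the `Device` key. If the real file has rows where another key holds a value starting with 1, the first `when` branch wins before the second is ever tried. The plain-Python sketch below illustrates this with a hypothetical helper `sql_like` that mimics the wildcard, a hypothetical helper `device_of` that decodes the payload, and a made-up row `tricky` that is not taken from the question.

```python
import json
import re
from urllib.parse import unquote


def sql_like(value: str, pattern: str) -> bool:
    """Mimic SQL LIKE: '%' matches any run of characters, '_' any single one."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.fullmatch("".join(parts), value, flags=re.DOTALL) is not None


def device_of(additional_data: str) -> int:
    """Decode the URL-encoded JSON payload and return its Device value."""
    return json.loads(unquote(additional_data))["Device"]


# Row taken from the question (Device 2).
sample = "%7B%22pluginType%22:5,%22Device%22:2%7D"
# Hypothetical row (an assumption): Device is 2, but pluginType is 1.
tricky = "%7B%22pluginType%22:1,%22Device%22:2%7D"

print(sql_like(sample.lower(), "%22:1%"))  # False: no "22:1" in this row
print(sql_like(tricky.lower(), "%22:1%"))  # True: matched via pluginType, not Device
print("%22device%22:2" in tricky.lower())  # True: literal match tied to the Device key
print(device_of(tricky))                   # 2
```

If this is indeed the cause, the PySpark conditions could be tied to the `Device` key with a literal substring test such as `lower(col('AdditionalData')).contains('%22device%22:2')`, since `contains` does not treat `%` as a wildcard. Including the character that follows the value (for example `%7d`) would make the match stricter still.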
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
