How to auto-increment based on a specific condition in PySpark

I have the below PySpark DataFrame:

id      dept
1       CSE
        ISE
        ECE
2       EEE
4       MCE

I am trying to populate a new column based on a condition: if id is null, fill the new column by incrementing by 1 from a defined starting value (max_value) for each such row; if id is not null, retain the existing id value.

To achieve this, I am using the code below:

from pyspark.sql import functions as F
from pyspark.sql.functions import monotonically_increasing_id, row_number
from pyspark.sql.window import Window

max_value = 5

df = df.withColumn("idx", monotonically_increasing_id())
w = Window().orderBy("idx")
df = df.withColumn("row_num", F.when(F.col("id").isNull(), (max_value + row_number().over(w)).otherwise(F.col("id"))))

But I am getting the below error:

IllegalArgumentException: otherwise() can only be applied on a Column previously generated by when()

Expected output:

id      dept   new_col
1       CSE    1
        ISE    6
        ECE    7
2       EEE    2
4       MCE    4

Can anyone help me resolve this issue? It would be much appreciated.
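
The error comes from parenthesis placement: in the code above, .otherwise(F.col("id")) is chained onto the arithmetic expression (max_value + row_number().over(w)) instead of onto the Column returned by F.when(), so Spark complains that otherwise() was not applied to a when() column. A second point to watch: row_number() numbers every row in the window, so for this data the null-id rows would get 5 + 2 = 7 and 5 + 3 = 8 rather than the expected 6 and 7; a running count of only the null ids produces the expected values. Below is a minimal, self-contained sketch along those lines (the SparkSession setup, the sample data, and the null_rank name are illustrative, not from the original post):

from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.getOrCreate()

# Sample data mirroring the question; None marks the missing ids.
df = spark.createDataFrame(
    [(1, "CSE"), (None, "ISE"), (None, "ECE"), (2, "EEE"), (4, "MCE")],
    ["id", "dept"],
)

max_value = 5

# Capture the current row order so the window is deterministic.
df = df.withColumn("idx", F.monotonically_increasing_id())
w = Window.orderBy("idx")

# Running count of null ids seen so far: 1 for the first null row, 2 for the next, ...
null_rank = F.sum(F.col("id").isNull().cast("int")).over(w)

df = df.withColumn(
    "new_col",
    F.when(F.col("id").isNull(), max_value + null_rank)  # when() closes here ...
     .otherwise(F.col("id")),                            # ... so otherwise() chains on the when() column
).drop("idx")

df.show()
# new_col: 1, 6, 7, 2, 4 (null ids become 6 and 7, existing ids are kept)

If the simpler max_value + row_number() numbering (which would give 7 and 8 here) is actually acceptable, the only change needed to the original line is moving the closing parenthesis of when() so that .otherwise() is called on the when() column rather than on the sum.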


