Rename pyspark dataframe columns based on when condition

I'm using pyspark and have a spark dataframe created as:

df = spark.createDataFrame([(1, None),
                            (2, 3),
                            (4, None)],
                            ["A", "B"])
df.show()

+---+----+
|  A|   B|
+---+----+
|  1|null|
|  2|   3|
|  4|null|
+---+----+

I'd like to rename the columns based on the number of missing values in each column, without calling .collect() or .first(). I'm flagging which columns have more than a certain number of missing values (equivalently, fewer than a certain number of non-null values) by doing:

import pyspark.sql.functions as F

# F.count(c) counts only non-null values, so this flags columns with < 2 non-null entries
missing = df.select([F.when(F.count(c) < 2, 1).otherwise(0).alias(c) for c in df.columns])

missing.show()

+---+---+
|  A|  B|
+---+---+
|  0|  1|
+---+---+

However, what I'd like to output is:

+---+--------+
|  A|delete_B|
+---+--------+
|  0|       1|
+---+--------+

How can I rename the dataframe columns so that "delete_" is prepended to the name of any column with more than a certain count of nulls?

Thanks a lot.



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
