PySpark: get the names of the columns that contain null values

I have a DataFrame and I want to get the names of the columns that contain one or more null values.

What I've done so far:

df.select([c for c in tbl_columns_list if df.filter(F.col(c).isNull()).count() > 0]).columns

My DataFrame has almost 500 columns, and when I execute that code it becomes incredibly slow, for a reason I don't understand. Do you have any clue how I can make it work, and how I can optimize it? I need an optimized PySpark solution, please. Thanks in advance.



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
