How to check if a value is in a group of columns in PySpark
I have a big DataFrame with multiple columns; some of those columns are named col1:col35.
I'd like to know whether a specific value is contained in ANY of those columns:
If 'RP' is in (col1:col35) then val = 1; else val = 0;
This is the code I was using in pandas, but I'd like to migrate it to PySpark:
```python
import functools
import numpy as np

df['exclude'] = functools.reduce(np.logical_or, [df['col{}_cd'.format(i)].str.contains('RP', na=False) for i in range(1, 36)])
```
I've tried the same code in PySpark, but I'm getting the following error:
```python
df1['exclude'] = reduce(np.logical_or, [df1['col{}_cd'.format(i)].str.contains('RP', na=False) for i in range(1, 36)])
```

```
TypeError: _() got an unexpected keyword argument 'na'
```
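The pandas-style `.str` accessor and the `na=` keyword are not part of the PySpark `Column` API, which is why the call above fails. One way to express the same condition natively in PySpark is to build one boolean `Column` per field with `contains`, OR them together with `functools.reduce`, and map the result to 1/0 with `when`/`otherwise`. The following is a minimal sketch, assuming the DataFrame is named `df1` and the columns are `col1_cd` through `col35_cd` as in the snippet above; `coalesce(..., lit(False))` is used to stand in for pandas' `na=False`.

```python
from functools import reduce
import operator

from pyspark.sql import functions as F

# One boolean condition per column; coalesce turns NULLs into False,
# mirroring na=False in the pandas version.
conditions = [
    F.coalesce(F.col("col{}_cd".format(i)).contains("RP"), F.lit(False))
    for i in range(1, 36)
]

# OR all conditions together and convert the boolean result to 1/0.
df1 = df1.withColumn(
    "exclude",
    F.when(reduce(operator.or_, conditions), 1).otherwise(0),
)
```

With this approach, a NULL in any of the columns simply contributes False to the OR, so the resulting `exclude` column is 1 only when 'RP' actually appears in at least one of the 35 columns.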
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
