Adding 10+ headers to a PySpark DataFrame
I have a CSV file that does not have headers and consists of 49 columns. I was given a separate CSV file with the column descriptions and column names. Instead of adding a StructField 49 times (like StructField("srcip", StringType(), True)), is there another way to do it, like a function?
Thank you.
Solution 1:[1]
Assuming you have a list of column names (e.g., read from the description CSV), you can loop through it and create a proper schema:
from pyspark.sql import types as T

cols = ['a', 'b', 'c']
schema = T.StructType([T.StructField(c, T.StringType()) for c in cols])
# StructType(List(StructField(a,StringType,true),StructField(b,StringType,true),StructField(c,StringType,true)))
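For a more end-to-end version, a sketch along these lines could pull the column names straight from the description file and apply the resulting schema when reading the headerless data. The file names column_descriptions.csv and data.csv and the column_name field are placeholders for the question's actual files; adjust them accordingly:

from pyspark.sql import SparkSession
from pyspark.sql import types as T

spark = SparkSession.builder.getOrCreate()

# Read the description file; assumes it has a header row with a
# "column_name" column (a placeholder, match it to your actual file)
desc = spark.read.csv("column_descriptions.csv", header=True)
cols = [row["column_name"] for row in desc.collect()]

# Build a schema with one nullable StringType field per column name
schema = T.StructType([T.StructField(c, T.StringType(), True) for c in cols])

# Apply the schema when reading the headerless 49-column data file
df = spark.read.csv("data.csv", schema=schema, header=False)
df.printSchema()

All 49 columns are read as strings here; if the description file also lists data types, you could map them to the corresponding pyspark.sql.types classes when building each StructField.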
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | pltc |
