How to save an empty PySpark DataFrame with its header to a CSV file?

Hi, I have a DataFrame that has only columns; there is no data in them. When I try to save it to a file, no header is written and the file is completely blank.

Example:

df.show()

+-----+----------------------+-------+---------------------+------------------------+----------------------------+--------------------------+----------------------+---------------+------------------------+-------------+-----------------+-----------------------+--------------+---------------+-----------+-----------------+-----------+------+--------+----------------+----------------------+--------------+-----+-------+---------+------+--------+
|owner|account_priority_score|account|call_objective_clm_id|call_objective_from_date|call_objective_on_by_default|call_objective_record_type|call_objective_to_date|display_dismiss|display_mark_as_complete|display_score|email_template_id|email_template_vault_id|email_template|expiration_date|no_homepage|planned_call_date|posted_date|reason|priority|record_type_name|suggestion_external_id|supress_reason|title|product|survey_id|groups|insrt_dt|
+-----+----------------------+-------+---------------------+------------------------+----------------------------+--------------------------+----------------------+---------------+------------------------+-------------+-----------------+-----------------------+--------------+---------------+-----------+-----------------+-----------+------+--------+----------------+----------------------+--------------+-----+-------+---------+------+--------+
+-----+----------------------+-------+---------------------+------------------------+----------------------------+--------------------------+----------------------+---------------+------------------------+-------------+-----------------+-----------------------+--------------+---------------+-----------+-----------------+-----------+------+--------+----------------+----------------------+--------------+-----+-------+---------+------+--------+

But when I save it to a file, the headers are not written. I am using the code below:

df.coalesce(1).write.mode('overwrite').csv(output_path, sep=output_delimiter,quote='',escape='\"', header='True', nullValue=None)


Solution 1:[1]

To do what you are asking you will have to define a schema.

So for example:

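from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

spark = SparkSession.builder.getOrCreate()

# Define the column names and types explicitly; no rows are needed.
schema = StructType([
    StructField("firstname", StringType(), True),
    StructField("middlename", StringType(), True),
    StructField("lastname", StringType(), True),
    StructField("id", StringType(), True),
    StructField("gender", StringType(), True),
    StructField("salary", IntegerType(), True),
])

# Create an empty DataFrame with that schema and write it with a header row.
df = spark.createDataFrame([], schema=schema)
df.coalesce(1).write.csv("/tmp/csv_data/", header=True)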

This will output a single CSV file containing just the headers.
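If the Spark version in use still produces a blank file for an empty DataFrame, one possible fallback is to write the header row yourself. The sketch below is not part of the original answer: it uses only Python's standard csv module, the helper name write_header_only is hypothetical, and the column list is a stand-in for df.columns. It also assumes the output path is on the local filesystem, so it would not work as written for HDFS or S3 paths.

```python
import csv
import os
import tempfile


def write_header_only(path, columns, sep=","):
    """Write a CSV file that contains only the header row (hypothetical helper)."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, delimiter=sep, lineterminator="\n")
        writer.writerow(columns)


# Stand-in for df.columns; with a real DataFrame you would pass df.columns.
columns = ["owner", "account_priority_score", "account"]

# Assumed local output location, used here only for illustration.
out_path = os.path.join(tempfile.gettempdir(), "header_only.csv")
write_header_only(out_path, columns, sep="|")
```

With a real DataFrame, the call would be guarded by an emptiness check such as df.rdd.isEmpty(), so that the normal Spark write is still used whenever there is data.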

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Benny Elgazar