How to save an empty PySpark DataFrame with a header into a CSV file?
Hi, I have a DataFrame that has only columns and no rows. When I try to save it to a file, no header is written; the file is completely blank.
Example:
df.show()
+-----+----------------------+-------+---------------------+------------------------+----------------------------+--------------------------+----------------------+---------------+------------------------+-------------+-----------------+-----------------------+--------------+---------------+-----------+-----------------+-----------+------+--------+----------------+----------------------+--------------+-----+-------+---------+------+--------+
|owner|account_priority_score|account|call_objective_clm_id|call_objective_from_date|call_objective_on_by_default|call_objective_record_type|call_objective_to_date|display_dismiss|display_mark_as_complete|display_score|email_template_id|email_template_vault_id|email_template|expiration_date|no_homepage|planned_call_date|posted_date|reason|priority|record_type_name|suggestion_external_id|supress_reason|title|product|survey_id|groups|insrt_dt|
+-----+----------------------+-------+---------------------+------------------------+----------------------------+--------------------------+----------------------+---------------+------------------------+-------------+-----------------+-----------------------+--------------+---------------+-----------+-----------------+-----------+------+--------+----------------+----------------------+--------------+-----+-------+---------+------+--------+
+-----+----------------------+-------+---------------------+------------------------+----------------------------+--------------------------+----------------------+---------------+------------------------+-------------+-----------------+-----------------------+--------------+---------------+-----------+-----------------+-----------+------+--------+----------------+----------------------+--------------+-----+-------+---------+------+--------+
But when I save it to a file, the header is not written. I am using the code below:
df.coalesce(1).write.mode('overwrite').csv(output_path, sep=output_delimiter,quote='',escape='\"', header='True', nullValue=None)
Solution 1:[1]
To do what you are asking you will have to define a schema.
So for example:
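from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

spark = SparkSession.builder.getOrCreate()

# Explicit schema: the column names and types are known even with no rows
schema = StructType([
    StructField("firstname", StringType(), True),
    StructField("middlename", StringType(), True),
    StructField("lastname", StringType(), True),
    StructField("id", StringType(), True),
    StructField("gender", StringType(), True),
    StructField("salary", IntegerType(), True),
])
# Empty DataFrame that still carries the schema
df = spark.createDataFrame([], schema=schema)
# header=True writes the column names as the first line of the output
df.coalesce(1).write.csv("/tmp/csv_data/", header=True)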
This writes a directory at /tmp/csv_data/ containing a single CSV part file with just the header row.
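If the Spark setup in use still produces a blank file for an empty DataFrame, one possible fallback is to write the header row on the driver with Python's standard csv module. The sketch below is not part of the original answer: the helper name write_header_only is hypothetical, and it assumes the output path is on a local filesystem reachable from the driver and that the column names come from df.columns.

```python
import csv


def write_header_only(columns, path, delimiter=","):
    """Write a one-line CSV file that contains only the column names.

    `columns` is any sequence of column names, e.g. `df.columns`.
    `path` is assumed to be a local file path (not HDFS or S3).
    """
    # newline="" lets the csv module control line endings itself
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, delimiter=delimiter, lineterminator="\n")
        writer.writerow(columns)
```

With a Spark DataFrame this might be called as write_header_only(df.columns, "/tmp/header.csv", delimiter=output_delimiter) when len(df.head(1)) == 0, falling back to the normal df.write.csv(...) call otherwise.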
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Benny Elgazar |
