PySpark: how to create a customized CSV from a DataFrame
I have the DataFrame below, which I need to write to a CSV file with a customized row layout and values.
common_df.show()
+-----+----------+-----+---+-----+----------+-----+---+
|name |department|state|id |name |department|state|id |
+-----+----------+-----+---+-----+----------+-----+---+
|James|Sales     |NY   |101|James|Sales1    |null |101|
|Maria|Finance   |CA   |102|Maria|Finance   |     |102|
|Jen  |Marketing |NY   |103|Jen  |          |NY2  |103|
+-----+----------+-----+---+-----+----------+-----+---+
I am currently using the approach below to convert the DataFrame to CSV:
pandasdf=common_df.toPandas()
pandasdf.to_csv("s3://mylocation/result.csv")
This writes the CSV with the same structure as the DataFrame. However, I need to restructure it into something like the format below. I think the solution is to split each row into two, placing the id on the left, but I don't see any example or solution that does this directly in Spark.
    |name  |dept      |state|id  |
------------------------------------
101 |James |Sales     |NY   |101 |
    |James |null      |NY   |101 |
------------------------------------
102 |Maria |Finance   |     |102 |
    |Maria |Finance   |CA   |102 |
------------------------------------
103 |Jen   |Marketing |NY   |103 |
    |Jen   |          |NY2  |103 |
------------------------------------
Any solution to this?
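One possible approach, sketched here rather than offered as a definitive answer: split each 8-field row by position into its left and right halves, then write two CSV lines per source row, with the id in a leading column on the first line only. The sketch below is plain Python using the standard `csv` module; `split_row`, `to_grouped_csv`, and `HEADER` are hypothetical names introduced for illustration, the sample rows are copied from the `common_df.show()` output above, and the dashed separator lines from the desired layout are left out because they are not valid CSV.

```python
import csv
import io

# Hypothetical header: a blank label column, then the four output columns.
HEADER = ["", "name", "dept", "state", "id"]


def split_row(row):
    """Split one 8-field row into its left and right 4-field halves.

    Slicing by position avoids the duplicate column names
    (name, department, state, id appear twice in the DataFrame).
    """
    return [tuple(row[:4]), tuple(row[4:8])]


def to_grouped_csv(rows):
    """Return CSV text with two lines per source row.

    The id labels only the first line of each pair; the second
    line leaves the label column empty.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(HEADER)
    for row in rows:
        first, second = split_row(row)
        writer.writerow([first[3], *first])   # id on the left
        writer.writerow(["", *second])        # continuation line
    return buf.getvalue()


# Sample rows mirroring the common_df.show() output (None stands for null).
rows = [
    ("James", "Sales", "NY", 101, "James", "Sales1", None, 101),
    ("Maria", "Finance", "CA", 102, "Maria", "Finance", "", 102),
    ("Jen", "Marketing", "NY", 103, "Jen", "", "NY2", 103),
]

csv_text = to_grouped_csv(rows)
print(csv_text)
```

For a small DataFrame, the rows could come from `common_df.collect()`, since PySpark `Row` objects are tuples and support positional slicing. To stay inside Spark, the same `split_row` could be passed to `common_df.rdd.flatMap(split_row)`, which yields two 4-field records per input row; note that the `csv` writer emits `None` as an empty field, so nulls and empty strings look the same in the output.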
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
