'Pyspark: Write CSV from JSON file with struct column

I'm reading a .json file that contains the structure below, and I need to generate a csv with this data in column form, I know that I can't directly write an array-type object in a csv, I used the explode function to remove the fields I need , being able to leave them in a columnar form, but when writing the data frame in csv, I'm getting an error when using the explode function, from what I understand it's not possible to do this with two variables in the same select, can someone help me with something alternative?

from pyspark.sql.functions import col, explode
from pyspark.sql import SparkSession

spark = (SparkSession.builder
    .master("local[1]")
    .appName("sample")
    .getOrCreate())

df = (spark.read.option("multiline", "true")
    .json("data/origin/crops.json"))

df2 = (explode('history').alias('history'), explode('trial').alias('trial'))
.select('history.started_at', 'history.finished_at', col('id'), trial.is_trial, trial.ws10_max))

(df2.write.format('com.databricks.spark.csv')
.mode('overwrite')
.option("header","true")
.save('data/output/'))

root
 |-- history: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- finished_at: string (nullable = true)
 |    |    |-- started_at: string (nullable = true)
 |-- id: long (nullable = true)
 |-- trial: struct (nullable = true)
 |    |-- is_trial: boolean (nullable = true)
 |    |-- ws10_max: double (nullable = true)

I'm trying to return something like this

started_at	finished_at	is_trial	ws10_max
First	row	row
Second	row	row

Thank you!

Solution 1:^[1]

Use explode on array and select("struct.*") on struct.

df.select("trial", "id", explode('history').alias('history')),
  .select('id', 'history.*', 'trial.*'))

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	Gohmz

'Pyspark: Write CSV from JSON file with struct column

Solution 1:[1]

Sources

Related Questions

Solution 1:^[1]