How do I explode a String in a Spark DataFrame?
I have a JSON string which is actually an array
{"[0].id":"cccccccc","[0].label":"xxxxxx","[0].deviceTypeId":"xxxxxxxxxxxx"}
I need to explode this so that all the keys become columns, something like this:
dataFrame.withColumn("single", explode_outer(col("nested")))
However, Spark keeps complaining that explode expects a map or an array.
How do I do this?
Solution 1:[1]
You can parse the JSON string into a MapType using from_json, then explode the resulting map and pivot the keys into columns:
import org.apache.spark.sql.functions._
import spark.implicits._

val df = Seq(
  (1, """{"[0].id":"cccccccc","[0].label":"xxxxxx","[0].deviceTypeId":"xxxxxxxxxxxx"}""")
).toDF("id", "nested")

val df1 = df
  .select(
    col("id"),
    explode(from_json(col("nested"), lit("map<string,string>")))
  )
  .groupBy("id")
  .pivot("key")
  .agg(first(col("value")))
df1.show
//+---+----------------+--------+---------+
//| id|[0].deviceTypeId| [0].id|[0].label|
//+---+----------------+--------+---------+
//| 1| xxxxxxxxxxxx|cccccccc| xxxxxx|
//+---+----------------+--------+---------+
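If the key set is known in advance, the pivot can be skipped entirely: after from_json produces the map column, each entry can be pulled out directly with getItem. This is a sketch assuming the same df as above; the column names `m` and `df2` are illustrative, not from the original answer.

```scala
// Assumes the same `df` as above, with the JSON string in the `nested` column.
val parsed = df.withColumn("m", from_json(col("nested"), lit("map<string,string>")))

// Select each known key straight out of the map, no groupBy/pivot needed.
val df2 = parsed.select(
  col("id"),
  col("m").getItem("[0].id").as("[0].id"),
  col("m").getItem("[0].label").as("[0].label"),
  col("m").getItem("[0].deviceTypeId").as("[0].deviceTypeId")
)
```

This avoids the shuffle that groupBy/pivot incurs, but only works when the keys are fixed and known up front; the pivot approach handles an arbitrary, data-driven key set.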
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | blackbishop |
