How do I explode a String in a Spark DataFrame?

I have a JSON string which is actually a flattened array:

{"[0].id":"cccccccc","[0].label":"xxxxxx","[0].deviceTypeId":"xxxxxxxxxxxx"}

I need to explode this so that all the keys become columns. I tried something like this:

dataFrame
  .withColumn("single", explode_outer(col("nested")))

However, Spark keeps complaining that the input to explode should be a map or an array.

How do I do this?



Solution 1:[1]

You can parse the JSON string into a MapType using from_json, then explode the map and pivot the keys into columns:

import org.apache.spark.sql.functions._
import spark.implicits._

val df = Seq(
  (1, """{"[0].id":"cccccccc","[0].label":"xxxxxx","[0].deviceTypeId":"xxxxxxxxxxxx"}""")
).toDF("id", "nested")

val df1 = (df
  .select(
    col("id"),
    explode(from_json(col("nested"), lit("map<string,string>")))
  )
  .groupBy("id")
  .pivot("key")
  .agg(first(col("value"))))

df1.show
//+---+----------------+--------+---------+
//| id|[0].deviceTypeId|  [0].id|[0].label|
//+---+----------------+--------+---------+
//|  1|    xxxxxxxxxxxx|cccccccc|   xxxxxx|
//+---+----------------+--------+---------+
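If the map keys are fixed and known ahead of time, a simpler variant avoids the groupBy/pivot shuffle entirely: parse the string with an explicit MapType schema and select each entry directly. This is a sketch (the `keys` list and `parsed`/`df2` names are illustrative, not from the original answer):

```scala
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types.{MapType, StringType}

// Assumed: the set of keys is known in advance.
val keys = Seq("[0].id", "[0].label", "[0].deviceTypeId")

// Parse the JSON string column into a map<string,string>.
val parsed = df.withColumn("m", from_json(col("nested"), MapType(StringType, StringType)))

// Pull each known key out of the map as its own column.
val df2 = parsed.select(col("id") +: keys.map(k => col("m")(k).as(k)): _*)
```

This trades generality for efficiency: pivot discovers the columns from the data, while direct selection requires knowing them but runs as a plain projection.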

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 blackbishop