Parsing a JSON-escaped string in PySpark vs. Scala
I need to convert the Scala code below to PySpark. How can the same result be achieved in PySpark?
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SQLContext}

val sc: SparkContext = ...
val sqlContext = new SQLContext(sc)
// problem_field holds an escaped JSON string rather than a nested object.
val escapedJsons: RDD[String] = sc.parallelize(Seq("""{"id":1,"name":"some name","problem_field":"{\"height\":180,\"weight\":80}"}"""))
// Unescape the embedded object so the JSON reader can infer it as a struct.
val unescapedJsons: RDD[String] = escapedJsons.map(_.replace("\"{", "{").replace("\"}", "}").replace("\\\"", "\""))
val dfJsons: DataFrame = sqlContext.read.json(unescapedJsons)
dfJsons.printSchema()
Output
root
|-- id: long (nullable = true)
|-- name: string (nullable = true)
|-- problem_field: struct (nullable = true)
| |-- height: long (nullable = true)
| |-- weight: long (nullable = true)
Big Thank you :)
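
For comparison, here is a minimal PySpark sketch of the same approach. It assumes Spark 2.x or later, where SparkSession replaces SQLContext, and the variable names (escaped_jsons, unescaped_jsons, df_jsons) are placeholders; the replace chain and reading JSON from an RDD of strings mirror the Scala version:

from pyspark.sql import SparkSession

# Assumes Spark 2.x+: SparkSession replaces SQLContext.
spark = SparkSession.builder.getOrCreate()
sc = spark.sparkContext

escaped_jsons = sc.parallelize(
    ['{"id":1,"name":"some name","problem_field":"{\\"height\\":180,\\"weight\\":80}"}']
)

# Strip the quotes around the embedded object and unescape its inner
# quotes so the JSON reader infers problem_field as a struct.
unescaped_jsons = escaped_jsons.map(
    lambda s: s.replace('"{', '{').replace('"}', '}').replace('\\"', '"')
)

df_jsons = spark.read.json(unescaped_jsons)
df_jsons.printSchema()

Note that this blind string replacement assumes no field value legitimately contains the sequences "{ or "} or an escaped quote; if that cannot be guaranteed, parsing the problem_field column with from_json after an initial read is a safer alternative.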
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow