Spark: convert JSON data to DataFrame using Scala
Input one.txt file
[{"a":1,"b":2,"c":3}, {"a":11,"b":12,"c":13},{"a":1,"b":2,"c":3}]
Expected output:
a       b       c
1,11,1  2,12,2  3,13,3
Could you please provide the solution using a Spark DataFrame in Scala?
val spark = SparkSession.builder().appName("JSON_Sample").master("local[1]").getOrCreate()
val data = """[{"a":1,"b":2,"c":3}, {"a":11,"b":12,"c":13},{"a":1,"b":2,"c":3}]""" // contents of one.txt
val df = spark.read.text("./src/main/scala/resources/text/one.txt") // reads raw text lines; does not parse the JSON
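For reference, a minimal sketch of parsing the file directly (assuming one.txt contains exactly the JSON array shown above): the multiLine option lets spark.read.json handle a top-level array that spans the whole file instead of one object per line. The val name parsed below is just illustrative.
// sketch: parse the whole-file JSON array directly (multiLine handles a top-level array)
val parsed = spark.read.option("multiLine", true).json("./src/main/scala/resources/text/one.txt")
parsed.show()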
Solution 1:[1]
This is the Python version of working Spark code. If you are able to convert it to Scala, that is fine; otherwise let me know and I will do it.
from pyspark.sql.functions import col, collect_list, concat_ws

df = spark.read.json(sc.parallelize([{"a":1,"b":2,"c":3},{"a":11,"b":12,"c":13},{"a":1,"b":2,"c":3}]))
df.show()
+---+---+---+
| a| b| c|
+---+---+---+
| 1| 2| 3|
| 11| 12| 13|
| 1| 2| 3|
+---+---+---+
df.agg(*[concat_ws(",",collect_list(col(i))).alias(i) for i in df.columns]).show()
+------+------+------+
| a| b| c|
+------+------+------+
|1,11,1|2,12,2|3,13,3|
+------+------+------+
For Scala Spark:
import org.apache.spark.sql.SparkSession
val spark = SparkSession.builder().appName("JSON_Sample").master("local[1]").getOrCreate()
import spark.implicits._ // needed for createDataset on a Seq[String]
val jsonStr = """[{"a":1,"b":2,"c":3}, {"a":11,"b":12,"c":13},{"a":1,"b":2,"c":3}]"""
val df = spark.read.json(spark.createDataset(jsonStr :: Nil))
// collect_list gathers each column's values into an array column
val exprs = df.columns.map(_ -> "collect_list").toMap
df.agg(exprs).show()
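The Map-based agg above returns array columns (e.g. [1, 11, 1]). To get the exact comma-joined strings from the expected output, a concat_ws over collect_list, mirroring the Python line above, could look like this (a sketch, not part of the original answer; aggExprs is an illustrative name):
import org.apache.spark.sql.functions.{col, collect_list, concat_ws}
// one concat_ws(collect_list(...)) expression per column, keeping the original column name
val aggExprs = df.columns.map(c => concat_ws(",", collect_list(col(c))).alias(c))
df.agg(aggExprs.head, aggExprs.tail: _*).show()
// +------+------+------+
// |     a|     b|     c|
// +------+------+------+
// |1,11,1|2,12,2|3,13,3|
// +------+------+------+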
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Stack Overflow |
