Twitter API with Structured Spark Streaming
I am trying to access the JSON data from tweets in my Kafka topic. In Spark Structured Streaming, when creating the schema, is it necessary to explicitly specify every key from the Twitter API? Can I access only the ones I want to analyse, such as the text field alone?
Solution 1:[1]
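While recommended, a schema is optional. To read a single field, cast the Kafka value to a string and extract the field by its JSON path with get_json_object. You should be able to do something like this: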
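import org.apache.spark.sql.functions.{col, get_json_object}
kafkaDf
  // the Kafka value column is binary; cast it to a JSON string
  .select(col("value").cast("string").as("value"))
  // extract only the "text" field by JSON path
  .select(get_json_object(col("value"), "$.text").as("text"))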
https://spark.apache.org/docs/latest/api/sql/index.html#get_json_object
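If you prefer the schema route, the schema does not have to cover every key in the tweet. from_json only parses the fields named in the schema and ignores the rest of the JSON. A minimal sketch of that alternative follows; the broker address localhost:9092, the topic name tweets, the object name TweetTextStream, and the console sink are assumptions for illustration, not details from the question, and the spark-sql-kafka-0-10 connector is assumed to be on the classpath.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Hypothetical driver object, named for illustration only.
object TweetTextStream {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("tweet-text")
      .master("local[*]") // assumption: local run
      .getOrCreate()

    // Partial schema: list only the keys you want to analyse.
    // Keys present in the tweet JSON but absent here are ignored.
    val tweetSchema = StructType(Seq(
      StructField("text", StringType, nullable = true)
    ))

    val kafkaDf = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092") // assumed broker
      .option("subscribe", "tweets")                       // assumed topic
      .load()

    val texts = kafkaDf
      // the Kafka value column is binary; cast it to a JSON string first
      .select(from_json(col("value").cast("string"), tweetSchema).as("tweet"))
      // pull the parsed field out of the struct
      .select(col("tweet.text").as("text"))

    // Print each micro-batch to the console for inspection.
    texts.writeStream
      .format("console")
      .option("truncate", "false")
      .start()
      .awaitTermination()
  }
}
```

Compared with get_json_object, this keeps the column typed and makes it easy to add more fields later by extending tweetSchema; records that fail to parse yield a null text value rather than an error under the default parsing mode.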
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
