Spark Structured Streaming with State (PySpark)
I want to match incoming records in Spark Structured Streaming against records that arrived earlier, based on a certain condition, and write the matched results to Kafka. Records that do not yet have a match should be kept in state, and that state should retain at most 2 days of data on HDFS. Each new incoming record should try to match against the unmatched records held in this state. How can I implement this stateful processing? (I'm using PySpark.)
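One way to implement this in PySpark is arbitrary stateful processing with `applyInPandasWithState`, which is only available in PySpark from Spark 3.4 onward (earlier versions expose `flatMapGroupsWithState` only in Scala/Java). Below is a minimal sketch under several assumptions: records are matched by simple key equality (standing in for your "certain condition"), at most one unmatched record is held per key, and the broker address, topic names, column names (`match_key`, `payload`), and checkpoint path are all placeholders. The 2-day retention is enforced with a processing-time timeout, and because Structured Streaming persists its state store under `checkpointLocation`, pointing that at HDFS keeps the state on HDFS.

```python
from typing import Iterator, Tuple

import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql.streaming.state import GroupState, GroupStateTimeout
from pyspark.sql.types import StructField, StructType, StringType

spark = SparkSession.builder.appName("stateful-matching").getOrCreate()

# Incoming stream; a Kafka source is assumed here.
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder
    .option("subscribe", "input-topic")                # placeholder
    .load()
    .selectExpr(
        "CAST(key AS STRING) AS match_key",
        "CAST(value AS STRING) AS payload",
    )
)

output_schema = StructType([
    StructField("match_key", StringType()),
    StructField("left_payload", StringType()),
    StructField("right_payload", StringType()),
])
# Per-key state: the payload of the record still waiting for a match.
state_schema = StructType([StructField("pending", StringType())])

TWO_DAYS_MS = 2 * 24 * 60 * 60 * 1000  # retention for unmatched records

def match_fn(
    key: Tuple[str],
    batches: Iterator[pd.DataFrame],
    state: GroupState,
) -> Iterator[pd.DataFrame]:
    if state.hasTimedOut:
        # No counterpart arrived within 2 days: evict the stale record.
        state.remove()
        return
    pending = state.get[0] if state.exists else None
    for pdf in batches:
        for payload in pdf["payload"]:
            if pending is None:
                # Nothing to match against yet: hold this record in state.
                pending = payload
            else:
                # New record matches the stored one: emit the pair.
                yield pd.DataFrame({
                    "match_key": [key[0]],
                    "left_payload": [pending],
                    "right_payload": [payload],
                })
                pending = None
    if pending is not None:
        state.update((pending,))
        state.setTimeoutDuration(TWO_DAYS_MS)  # expire after 2 idle days
    elif state.exists:
        state.remove()

matched = events.groupBy("match_key").applyInPandasWithState(
    match_fn,
    output_schema,
    state_schema,
    "append",
    GroupStateTimeout.ProcessingTimeTimeout,
)

# Matched pairs go to Kafka; the state store is checkpointed under
# checkpointLocation, so an HDFS path keeps the state on HDFS.
query = (
    matched.selectExpr("match_key AS key", "to_json(struct(*)) AS value")
    .writeStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder
    .option("topic", "matched-topic")                  # placeholder
    .option("checkpointLocation", "hdfs:///checkpoints/stateful-matching")  # placeholder
    .start()
)
```

If you need the 2-day window measured in event time rather than processing time, use `GroupStateTimeout.EventTimeTimeout` together with a watermark on the input and `state.setTimeoutTimestamp` instead of `state.setTimeoutDuration`.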
Sources
This Q&A follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
