'Deserialize Kafka Thrift message with PySpark Streaming to JSON
I am using databricks environment and reading a Kafka input.
The messages consumed from Kafka are in Thrift Binary and am having issues deserialize it to JSON.
I am writing an UDF to do this, but am unable to figure out how to convert thrift -> JSON?
I have tried
import thriftpy2.protocol.json as proto
def decoder(thrift_data):
return proto.struct_to_json(thrift_data)
which throws an error PythonException: 'AttributeError: 'bytearray' object has no attribute 'thrift_spec''
and also tried
def decoder(thrift_data):
return serialize(thrift_data, protocol_factory=TSimpleJSONProtocolFactory())
which throws an error PythonException: 'AttributeError: 'bytearray' object has no attribute 'write'',
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|
