AWS MSK Connector - Kafka topic to S3
I am using AWS MSK.
I want a connector that takes messages from a Kafka topic and writes them to an S3 bucket.
The Kafka topic contains protobuf messages. The messages are of different types, and the types can change at runtime. I have a Kafka header that holds the type, e.g. {"type":"Event97"}
The generated files in the bucket should be in Parquet format.
Each generated file in the bucket should have the event type as its prefix, e.g. Event97/2022/02/19/23/<random_value_goes_here>.parquet
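To make the layout concrete, the key I am after could be built roughly like this. This is only a minimal sketch of the naming rule, not connector code: `build_object_key` and its parameters are hypothetical names, and I am assuming the date and hour segments come from the record timestamp in UTC and that the random part is a UUID.

```python
import uuid
from datetime import datetime, timezone
from typing import Optional


def build_object_key(event_type: str, timestamp_ms: int,
                     suffix: Optional[str] = None) -> str:
    """Build '<type>/<yyyy>/<MM>/<dd>/<HH>/<random>.parquet' for one record."""
    # Assumption: the time-based path segments use the record timestamp in UTC.
    ts = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
    # Assumption: the random part is a UUID; any unique value would do.
    random_part = suffix if suffix is not None else uuid.uuid4().hex
    return f"{event_type}/{ts:%Y/%m/%d/%H}/{random_part}.parquet"


# Example: a record of type "Event97" stamped 2022-02-19 23:15 UTC.
ms = int(datetime(2022, 2, 19, 23, 15, tzinfo=timezone.utc).timestamp() * 1000)
print(build_object_key("Event97", ms, suffix="abc123"))
# prints "Event97/2022/02/19/23/abc123.parquet"
```

In a real connector the `event_type` argument would be read from the "type" header of each record.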
I looked at the Confluent connector and found the following challenges:
1. I will have to implement a custom Partitioner, since I can't find a ready-to-use partitioner that creates the file name I need (Event97/..).
2. Since the AWS Schema Registry does not support protobuf, I don't know how to convert the protobuf byte array into Parquet.
I can solve #1, but I have no idea how to overcome #2.
Any ideas?
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
