AWS MSK Connector - Kafka topic to S3

I am using AWS MSK.

I want a connector that takes messages from a Kafka topic and writes them to an S3 bucket.

The Kafka topic contains protobuf messages. The messages are of different types, and the types can change at runtime. Each message has a Kafka header that holds its type, for example: {"type":"Event97"}

The generated files in the bucket should be in Parquet format.

The key of each generated file should start with the event type as a prefix, for example: Event97/2022/02/19/23/<random_value_goes_here>.parquet
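
To make the layout concrete, here is a minimal sketch of how such a key could be built from the header and the record timestamp. It is only an illustration, not a connector implementation: the class `EventKeyLayout` and its methods are hypothetical names, and it assumes the header value is UTF-8 text and that the date path is computed in UTC.

```java
import java.nio.charset.StandardCharsets;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Map;
import java.util.UUID;

// Hypothetical helper, not part of any connector API: it only illustrates
// the object key layout described in the question.
final class EventKeyLayout {

    // Hour-granularity path such as 2022/02/19/23 (using UTC is an assumption).
    private static final DateTimeFormatter HOURLY =
            DateTimeFormatter.ofPattern("yyyy/MM/dd/HH").withZone(ZoneOffset.UTC);

    private EventKeyLayout() {}

    // Reads the event type from the raw "type" header bytes (assumed UTF-8 text).
    static String eventType(Map<String, byte[]> headers) {
        byte[] raw = headers.get("type");
        if (raw == null) {
            throw new IllegalArgumentException("missing 'type' header");
        }
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Builds <type>/<yyyy>/<MM>/<dd>/<HH> from the type and the record timestamp.
    static String prefixFor(String eventType, long timestampMs) {
        return eventType + "/" + HOURLY.format(Instant.ofEpochMilli(timestampMs));
    }

    // Appends a random file name with the .parquet extension to the prefix.
    static String objectKey(String eventType, long timestampMs) {
        return prefixFor(eventType, timestampMs) + "/" + UUID.randomUUID() + ".parquet";
    }
}
```

For a record with type Event97 and a timestamp inside the hour 2022-02-19 23:00 UTC, `prefixFor` yields `Event97/2022/02/19/23`, and `objectKey` adds a random UUID in place of `<random_value_goes_here>`.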

I was looking at the Confluent connector and noticed the following challenges:

  1. I will have to implement a Partitioner, since I can't find a ready-to-use partitioner that creates the file name I need (Event97/..)

  2. Since the AWS Schema Registry does not support protobuf, I don't know how to convert the protobuf byte array into Parquet.

I can solve #1, but I have no idea how to overcome #2.

Any ideas?



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
