Category "apache-kafka"

How do I serialize non-ascii characters in pykafka?

I have a defauldict I want to serialize as json. But some characters then replaced with unicode escaped charasters. For example, if I have string "International

kerberos error while authenticating on Confluent Kafka

I´ve been trying to understand apache beam, confluent kafka and dataflow integration with python 3.8 and beam sdk 2.7 the desire result is to build a pipe

Send and load an ML model over Apache Kafka

I've been looking around here and on the Internet, but it seems that I'm the first one having this question. I'd like to train an ML model (let's say something

Connect to Kafka running in Docker

I setup a single node Kafka Docker container on my local machine like it is described in the Confluent documentation (steps 2-3). In addition, I also exposed Z

Kafka authentication with Jaas config

I have set up my Kafka jaas config as an external bean in my spring boot application to read my configuration from my application.yaml file. But I am facing an

get count of partitions in a kafka topic with scala 2.12

With scala 2.11 and spark-streaming-kafka-0-8_2.11 I could do import org.apache.spark.streaming.kafka.KafkaCluster val params = Map[String, Object]( "bootstr

Found directory not in the form of topic-partition . Kafka's log directories (and children) should only contain Kafka topic data

[2021-04-05 07:51:32,180] ERROR There was an error in one of the threads during logs loading: org.apache.kafka.common.KafkaException: Found directory /var/lib/k

Why kafka write null data to the database?

I am working on a nestjs project. My project gets data from Kafka's topic and writes the data to the database (mysql). If I read hundreds of messages from Kafka

How to test Kafka OnFailure callback with Junit?

I have the following code to send data to Kafka: @Service public class KafkaSender{ @Autowired private KafkaTemplate<String, Employee> kafkaTempla

Spark + Read kafka topic from a specific offset based on timestamp

How do I set a spark job to pick up a kafka topic from a specific offset based on a timestamp ? Let's say that I need to get all data from a kafka topic startin

Apache Kafka - Implementing a KTable

I am new to Kafka Streams API and I am trying to create a KTable. I have an input topic: s-order-topic, which is a json format message, as shown below. { "curr

Is it possible to query a KSQL Table/Materialized view via HTTP?

I have a materialized view created using CREATE TABLE average_latency AS SELECT DEVICENAME, AVG(LATENCY) AS AVG_LATENCY FROM metrics WINDOW TUMBLING (SIZE 1 MIN

kafka consumer is not listening from command line

I am using kafka 1.0.0V In my project. From yesterday on wards. I am unable to listen Messages from command line . In the same time I am able to listen the mess

How to do stream processing with Redpanda?

Redpanda seems easy to work with, but how would one process streams in real-time? We have a few thousand IoT devices that send us data every second. We would li

What is the difference between kafka earliest and latest offset values

producer sends messages 1, 2, 3, 4 consumer receives messages 1, 2, 3, 4 consumer crashes/disconnects producer sends messages 5, 6, 7 consumer comes back up

How to drain the window after a Flink join using coGroup()?

I'd like to join data coming in from two Kafka topics ("left" and "right"). Matching records are to be joined using an ID, but if a "left" or a "right" record i

Table-Table Join duplicate entries

we are using kafka in production and I try to push the adoption and usage of KSQL in the same direction. But I already failed with one simple table-table join.

Kafka consumer unit test with Avro Schema registry failing

I'm writing a consumer which listens to a Kafka topic and consumes message whenever message is available. I've tested the logic/code by running Kafka locally an

Clickhouse Kafka engine on cluster

I'm playing with Kafka engine on ClickHouse cluster. At the moment ClickHouse 22.1 cluster and Kafka are run in Docker. Here are configurations: https://github.

Can Kafka Connect consume data from a separate kerberized Kafka instance and then route to Splunk?

My pipeline is: Kerberized Kafka --> Logstash (hosted on a different server) --> Splunk. Can I replace the Logstash component with Kafka Connect? Could