Category "apache-kafka"

Create an open-telemetry span using trace-id and span-id in Java

I'm using open-telemetry to trace my applications, and my distributed system has a few microservices and a Kafka broker. I'm using Java/spring-boot and in the

How to load multiple postgresql tables into multiple kafka topics in google cloud environment?

Load multiple PostgreSQL tables into multiple Kafka topics in a Google Cloud environment using Pub/Sub or Kafka Connect.

pyspark.sql.utils.AnalysisException: Failed to find data source: kafka

I am trying to read a stream from Kafka using PySpark. I am using Spark version 3.0.0-preview2 and spark-streaming-kafka-0-10_2.12. Before this I just stat zoo

How to filter using ksql with array attribute type

I have a Stream on a topic with schema: --root --name: string --age: integer --accounts: Array --email I would like to select all root elements hav

How to skip kafka history data in flink job if certain lag is encountered?

Sometimes we encounter lag in the Kafka consumer due to some external issues. The Flink job will always consume kafka history (delayed data) with exactly-once semantics

How to stop listening in a Spring Kafka consumer?

I use Spring for Apache Kafka. I'd like to stop listening to my topic and wait, in order to avoid an OOM error. How can I do it?

Can't establish SSL connection to Kafka after upgrading to python 3.7

Code I have that successfully connects to Kafka with an SSL connection in Python 3.6.7 fails when using Python 3.7.3, with error message SSL: WRONG_VERSION_NUMB

Kafka - broker partitions not in-sync after restart

We use 3-node Kafka clusters running 2.7.0 with quite a high number of topics and partitions. Almost all the topics have only 1 partition and replication factor o

Kafka MirrorMaker2 ports for inter-cluster communication

I have set up Apache MirrorMaker 3.0.0 with active-active strategy for two Kafka clusters (named DC, DR). So topic on DC is replicated by MirrorMaker2 as DC.<

Connection between Kafka and Spark: Failed to find data source: kafka

I am trying to link Kafka and Spark by reading data from one topic and trying to print the content of this topic into a DataFrame, but by doing connect

Unable to connect to kafka in docker from Spring boot in host machine

I have already gone through Robin Moffatt's blog, and several SO posts on the same subject, but still my configuration doesn't work. My docker-compose.yml: kafka

How to start a process (or Kafka to be specific) on remote host with Ansible Playbook

How to start Zookeeper and Kafka broker on remote target with Ansible Playbook. Following commands work fine locally. Start Zookeeper: cd /opt/kafka ./bin/zooke

Structured Streaming to Save JSON to HDFS

My Structured Spark Streaming program is to read JSON data from Kafka and write to HDFS in JSON format. I am able to save JSON to HDFS but it saves the JSON st

No qualifying bean of type 'org.springframework.kafka.core.ProducerFactory<java.lang.Object, java.lang.Object>'

I use this kafka configuration with spring cloud and spring boot 2.6.6: @Configuration @RefreshScope public class KafkaProducerConfig { @Bean(name = "nativeP

How to get Kafka offset data for a specified timestamp

I tried to get the offset from a Kafka topic based on a timestamp, but when I ran it, it threw a null pointer error: Map<TopicPartition, Long> timest

How to query the state store in the Kafka Streams DSL to implement consumer idempotency

I'm working in a scenario where duplicate messages could arrive at a consumer (a KStream application). To use the typical case let's suppose it's an OrderCrea

How to monitor the amount of messages in a Kafka topic per day?

I have a Kafka cluster with a topic that receives thousands of messages a day and I want to see how many messages went in the topic per date. I'm using JMX expo

Kafka streams - Concatenate Predicate based on dynamic number of conditions

I'm a bit new to Java so I would appreciate advice on dealing with multiple conditions in Kafka Predicates. I have the following code which I'm able to have dynamic