Category "flink-streaming"

Flink Python Datastream API Kafka Producer Sink Serializaion

I'm trying to read data from one kafka topic and writing to another after making some processing. I'm able to read data and process it when i try to write it to

No ExecutorFactory found to execute the application in Flink 1.11.1

first of all I have read this post about the same issue and tried to follow the same solution that works for him (create a new quickstart with mvn and migrate t

Create pyFlink DataStream Consumer from Tweets Kafka Producer in Python

I want to create I stream kafka consumer in pyFlink, which can read tweets data after deserialization (json), I have pyflink version 1.14.4 (last version) Can I

if we cancel the job with savepoint, job got cancelled and savepoint was failure how to restore this job now

I have deployed flink job in application mode using native kubernetes deployment and stopping job along with savepoint (I'm using rest api command for that) but

Flink TaskManager not reconnecting to the new Jobmanager

I have configured Flink in HA mode as mentioned here: I wanted to test the fault tolerance, hence I did the following: Setup Flink cluster with 2 JobManagers

Flink checkpoint not replaying the kafka events which were in process during the savepoint/checkpoint

I want to test end-to-end exactly once processing in flink. My job is: Kafka-source -> mapper1 -> mapper-2 -> kafka-sink I had put a Thread.sleep(100

Apache Flink - writing stream to S3 error - null uri host

I have a Flink data pipeline that transforms the log file downloaded from S3 and write back in parquet file format to another S3 bucket. I have configured the S

Integration testing flink job

I've written a small flink application. I has some input, and enriches it with data from an external source. It's an RichAsyncFunction and within the open metho

Apache Flink: AWS S3 timeout exception when starting a job from a savepoint

I have a Flink job which has large state in a Map operator. We are taking savepoint which has around 80GB storing to AWS S3. We have around 100 parallelism for

Flink Missing Events With Windowed Processor(Event Time Windows) and Kafka Source

We have a Streaming Job that has 20 separate pipelines, with each pipeline having one/many Kafka topic sources and with some pipelines having Windowed Processor