Creating a Flink DataStream from database query results

I need to query a database and join the query results with a Kafka data stream in Flink. Currently I do this by storing the query results in a file and then using Flink's readFile functionality to create a DataStream from them. Is there a better approach that bypasses the intermediate step of writing to a file and creates a DataStream directly from the query results?

My current understanding is that I would need to write a custom SourceFunction as suggested here. Is this the right and only way or are there any alternatives?

Are there any good resources for writing custom SourceFunctions, or should I just look at existing implementations for reference and customise them for my needs?



Solution 1:[1]

One straightforward solution would be to use a lookup join, perhaps with caching enabled.
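
As a rough illustration of what a lookup join with caching can look like in Flink SQL, here is a minimal sketch using the Kafka and JDBC connectors. The table names, columns, topic, and connection details are all hypothetical placeholders, and the cache option names (`lookup.cache.max-rows`, `lookup.cache.ttl`) are the ones documented for the JDBC connector in Flink 1.x releases; newer versions may prefer different option names, so check the connector documentation for your version.

```sql
-- Hypothetical Kafka-backed stream; a processing-time attribute is
-- required on the probe side of a lookup join.
CREATE TABLE orders (
  order_id    STRING,
  customer_id INT,
  proc_time AS PROCTIME()
) WITH (
  'connector' = 'kafka',
  'topic' = 'orders',                                   -- assumed topic name
  'properties.bootstrap.servers' = 'localhost:9092',    -- assumed broker address
  'properties.group.id' = 'orders-lookup-demo',
  'scan.startup.mode' = 'latest-offset',
  'format' = 'json'
);

-- Hypothetical database table exposed through the JDBC connector.
-- The cache options keep recently looked-up rows in memory so that
-- not every stream record triggers a database query.
CREATE TABLE customers (
  id   INT,
  name STRING,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector' = 'jdbc',
  'url' = 'jdbc:mysql://localhost:3306/mydb',           -- assumed JDBC URL
  'table-name' = 'customers',
  'lookup.cache.max-rows' = '5000',
  'lookup.cache.ttl' = '10min'
);

-- Lookup join: each order is enriched with the customer row as seen
-- at the time the order is processed.
SELECT o.order_id, o.customer_id, c.name
FROM orders AS o
JOIN customers FOR SYSTEM_TIME AS OF o.proc_time AS c
  ON o.customer_id = c.id;
```

With this approach the database is queried on demand per key, so there is no intermediate file. The trade-off of enabling the cache is that rows may be stale for up to the configured TTL.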

Other possible solutions include using Kafka Connect, or using something like Debezium to mirror the database table into Flink. Here's an example: https://github.com/ververica/flink-sql-CDC.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 David Anderson