Creating a Flink DataStream from database query results
I need to query a database and join the query results with a Kafka data stream in Flink. Currently this is done by storing the query results in a file and then using Flink's readFile functionality to create a DataStream of the query results. What would be a better approach that bypasses the intermediate step of writing to a file and creates a DataStream directly from the query results?
My current understanding is that I would need to write a custom SourceFunction as suggested here. Is this the right and only way or are there any alternatives?
Are there any good resources for writing custom SourceFunctions, or should I just look at current implementations for reference and customise them for my needs?
Solution 1:[1]
One straightforward solution would be to use a lookup join, perhaps with caching enabled.
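To make the idea concrete, here is a minimal plain-Java sketch of what a cached lookup join does conceptually: each incoming stream record triggers a lookup by key, and a bounded cache keeps repeated keys from hitting the database again. It deliberately uses no Flink APIs. In Flink itself this would normally be written as a lookup join in Flink SQL against a connector that supports lookups, with the caching handled by the connector. The names `CachedLookupJoinSketch`, `LookupCache`, and `join` are hypothetical, and the in-memory map stands in for the real database query.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Plain-Java illustration of a lookup join with an LRU cache; not Flink API. */
class CachedLookupJoinSketch {

    /** Bounded LRU cache in front of a lookup function (a stand-in for a database query). */
    static final class LookupCache<K, V> {
        private final Function<K, V> loader;
        private final Map<K, V> cache;
        private int loads = 0;

        LookupCache(Function<K, V> loader, final int maxRows) {
            this.loader = loader;
            // accessOrder = true lets LinkedHashMap evict the least recently used entry.
            this.cache = new LinkedHashMap<K, V>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                    return size() > maxRows;
                }
            };
        }

        V get(K key) {
            V value = cache.get(key);
            if (value == null) {
                loads++; // cache miss: go to the "database"
                value = loader.apply(key);
                if (value != null) {
                    cache.put(key, value); // only found rows are cached
                }
            }
            return value;
        }

        int loads() {
            return loads;
        }
    }

    /** Enriches each stream key with its looked-up row; unmatched keys are dropped (inner join). */
    static List<String> join(List<Integer> streamKeys, LookupCache<Integer, String> lookup) {
        List<String> out = new ArrayList<>();
        for (Integer key : streamKeys) {
            String row = lookup.get(key);
            if (row != null) {
                out.add(key + ":" + row);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical table contents; a real job would query the database here.
        Map<Integer, String> fakeTable = Map.of(1, "alice", 2, "bob");
        LookupCache<Integer, String> lookup = new LookupCache<>(fakeTable::get, 100);
        System.out.println(join(List.of(1, 2, 1, 3, 1), lookup));
        System.out.println("database lookups: " + lookup.loads());
    }
}
```

The trade-off is the usual one for caching: fewer database round trips in exchange for possibly stale rows, which is why a real setup would typically also bound how long cached entries live.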
Other possible solutions include Kafka Connect, or using something like Debezium to mirror the database table into Flink. Here's an example: https://github.com/ververica/flink-sql-CDC.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | David Anderson |
