Is there a way to write an operator in Apache Beam that runs differently on different runners?
For example, let's say I want to enrich a collection with more values from a lookup table by joining on a key. On the Spark runner, I would prefer to do a broadcast join for this operator, whereas on the Flink runner, I would like to make RPC calls (say, to Redis) to load the values based on the key.
So is there a way to achieve this: the same logical semantics, but different execution depending on the runner?
Solution 1:[1]
In any case, the internal Beam implementation of many transforms already depends on the runner, but if you need this for your own transforms, then you can use PipelineOptions to get the runner name and decide which code path to take.
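A minimal Java sketch of that idea, assuming the standard Beam SDK; the class name, the `contains()`-based runner check, and the empty branch bodies are illustrative placeholders, not code from the answer:

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class RunnerAwareEnrichment {
  public static void main(String[] args) {
    // --runner=SparkRunner / --runner=FlinkRunner is picked up from the args here.
    PipelineOptions options = PipelineOptionsFactory.fromArgs(args).withValidation().create();
    Pipeline pipeline = Pipeline.create(options);

    // getRunner() returns the configured runner class, e.g. SparkRunner.class.
    String runnerName = options.getRunner().getSimpleName();

    if (runnerName.contains("Spark")) {
      // Spark path: build the enrichment as a side-input join
      // (the Beam analogue of a broadcast join).
      // pipeline.apply(...);
    } else if (runnerName.contains("Flink")) {
      // Flink path: build the enrichment as a DoFn that performs
      // per-key RPC lookups (e.g. against Redis).
      // pipeline.apply(...);
    } else {
      // Fallback for other runners (DirectRunner, Dataflow, ...).
    }

    pipeline.run().waitUntilFinish();
  }
}
```

The same check works inside a composite PTransform's `expand()` method via `input.getPipeline().getOptions()`, so the branching can be hidden behind a single transform with one logical contract.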
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Alexey Romanenko |
