'Using GenStage/Flow for soft-realtime event processing

Im currently building a soft-realtime event processing system using Elixir and now Im trying to wrap my head around GenStage/Flow to understand if that is the correct abstractions to build on. Unfortunately the examples of larger applications are sparse and most are about parallel processing of stale data. Im using an infinite event stream as Source.

My question is if it makes sense to build on GenStage/Flow for a case where im subscribing on an infinite external event stream that pushes events to my application. I want the events to be processed right away when they arrive at the server. That is I dont want to buffer them until I get 500 for Flow to start demand. But does it make sense to use a min demand of 1?



Solution 1:[1]

I would say that for anything near real-time GenStage won't work. The core idea is to postpone evaluation until there's a demand (say, worker is ready). If you want to process them right away, just spawn new Elixir process for every event and pray that scheduler doesn't choke :)

Even if you use the min demand of 1, it will process sequentially and you'll get overflow events buffered. Yes, you can parallelize, but I'm not sure whether you'll have to start parallel stages right away or not. And again, when you reach N simultaneous events having N parallel workers, the events will get buffered.

UPDATE After a bit of consideration I think that in some cases GenStage or its higher order derivative Flow could still be useful for near real-time processing. To avoid overhead on process creation, one could use fixed width windows to collect and partition events, which then could be processed by different consumers, pools or even by individual processes if needed. The only drawback or rather constraint is that this will introduce quantization, which can be as low as 1ms: https://github.com/elixir-lang/flow/blob/v0.14.2/lib/flow/window.ex#L324 However this is just a theoretical speculation, I haven't tried this.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1