Does ksqlDB's custom UDAF function guarantee concurrency (thread safety)?
I'm running 5 ksqlDB instances (on k8s), and each instance is configured with 3 threads (in the ksqlDB server properties).
I implemented a UDAF to aggregate a simple map object. No data corruption occurred when more than 10,000 records per minute were aggregated through the UDAF in a cluster environment. My guess is that the UDAF guarantees concurrency; am I right?
I have one more question. I am currently running a ksqlDB instance in a k8s environment. Will ksqlDB's aggregated table data survive a restart without loss?
Solution 1:[1]
To answer your first question, ksqlDB creates a new instance of each UDAF that is called and uses it in a single-threaded manner; ksqlDB does not reuse UDAF instances.
This means that if you, as the implementer, write a UDAF that does not use global state, then yes, your UDAF should be thread-safe.
For your second question, I believe the answer is "yes". UDAFs use the aggregate function to persist intermediate state to state stores; that state ought to be restored when a ksqlDB node is restarted.
That said, technically, in either case, one could write a UDAF which fails to be thread-safe or does something really weird and would not restore properly.
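To make the "no global state" point concrete, here is a minimal sketch of a map-summing aggregate. The `SimpleUdaf` interface and the `MapSumUdaf` class are hypothetical stand-ins written for this example; they only mirror the general initialize / aggregate / merge / map shape of a UDAF and are not the real ksqlDB API. The idea is that all state travels through the aggregate value passed in and returned, so nothing is shared between instances or threads.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the shape of a UDAF (assumption for this sketch,
// NOT the real ksqlDB interface): I = input, A = aggregate, O = output.
interface SimpleUdaf<I, A, O> {
    A initialize();
    A aggregate(I current, A aggregate);
    A merge(A aggOne, A aggTwo);
    O map(A agg);
}

// Sums the values per key across all input maps.
// Assumes input maps contain no null values.
final class MapSumUdaf
        implements SimpleUdaf<Map<String, Long>, Map<String, Long>, Map<String, Long>> {

    // Anti-pattern, intentionally commented out: a static mutable field would be
    // shared by every instance and every thread, which is the kind of global
    // state that can break thread safety.
    // private static final Map<String, Long> SHARED = new HashMap<>();

    @Override
    public Map<String, Long> initialize() {
        // A fresh, empty aggregate for each new group.
        return new HashMap<>();
    }

    @Override
    public Map<String, Long> aggregate(Map<String, Long> current,
                                       Map<String, Long> aggregate) {
        // Copy the running aggregate so the caller's map is never mutated.
        Map<String, Long> result = new HashMap<>(aggregate);
        if (current != null) {
            current.forEach((k, v) -> result.merge(k, v, Long::sum));
        }
        return result;
    }

    @Override
    public Map<String, Long> merge(Map<String, Long> aggOne,
                                   Map<String, Long> aggTwo) {
        // Combine two partial aggregates into a new map.
        Map<String, Long> result = new HashMap<>(aggOne);
        aggTwo.forEach((k, v) -> result.merge(k, v, Long::sum));
        return result;
    }

    @Override
    public Map<String, Long> map(Map<String, Long> agg) {
        // The aggregate is already in its final form.
        return agg;
    }
}
```

Because the only state is the aggregate value itself, it is also the only thing that would need to be persisted and restored; uncommenting the static field and writing to it would give exactly the kind of UDAF that could fail under concurrency or after a restart.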
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | GeoJim |
