How many partitions and throughput units for 50,000 events per second with Azure Event Hubs?

I am working on a project where we need to handle around 50,000 events per second. We decided to go with Azure Event Hubs. I am currently doing a POC to analyze the cost of using Azure Event Hubs. From the documentation, I read the following:

1 TU = 1 MB/s or 1,000 events per second of ingress (whichever comes first)

I need to decide how many event hubs, partitions, and TUs I should use for this particular case. The maximum size of an event is 256 KB.

I decided to use 50 TUs with 10 event hubs with 10 partitions.

Are the above numbers correct, and how can I calculate the cost?

Also, I am looking for suggestions on how to handle a large number of events. We will store these events in a database.



Solution 1:[1]

Your TU calculation is correct; however, a Standard SKU Event Hubs namespace can provide up to 40 TUs. Since you need more than 40 TUs, it is better to create a dedicated Event Hubs cluster instead. You can find more about Event Hubs clusters here: https://docs.microsoft.com/en-us/azure/event-hubs/event-hubs-dedicated-overview

One more thing: plan for a maximum of 1 MB/sec of ingress per partition. I recommend you start with 64 partitions and scale out as needed.
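As a rough illustration of the sizing above, here is a minimal Python sketch that estimates the TU count from both per-TU limits quoted in the question and the minimum partition count from the 1 MB/sec per-partition guideline. The function names, the average event size, and the choice to treat 1 MB as 10^6 bytes are assumptions made for this example, not values from the Azure documentation.

```python
# Per-TU ingress limits as quoted in the question.
TU_INGRESS_BYTES_PER_SEC = 1_000_000   # assumption: 1 MB taken as 10^6 bytes
TU_INGRESS_EVENTS_PER_SEC = 1_000
STANDARD_MAX_TUS = 40                  # Standard namespace limit cited in this answer


def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


def required_tus(events_per_sec: int, avg_event_bytes: int) -> int:
    """TUs needed for ingress: whichever limit (count or bytes) is hit first wins."""
    by_count = ceil_div(events_per_sec, TU_INGRESS_EVENTS_PER_SEC)
    by_bytes = ceil_div(events_per_sec * avg_event_bytes, TU_INGRESS_BYTES_PER_SEC)
    return max(by_count, by_bytes)


def min_partitions_for_ingress(events_per_sec: int, avg_event_bytes: int,
                               partition_bytes_per_sec: int = 1_000_000) -> int:
    """Lower bound on partitions when planning 1 MB/sec of ingress per partition."""
    return ceil_div(events_per_sec * avg_event_bytes, partition_bytes_per_sec)


if __name__ == "__main__":
    rate = 50_000
    avg_size = 1_000  # hypothetical average event size in bytes
    tus = required_tus(rate, avg_size)
    print("TUs needed:", tus)
    print("Exceeds Standard limit:", tus > STANDARD_MAX_TUS)
    print("Minimum partitions:", min_partitions_for_ingress(rate, avg_size))
```

With a hypothetical average of 1,000 bytes per event, both limits give 50 TUs, which is above the 40 TU Standard limit. A larger average event size pushes the byte-based figure higher, so it is worth measuring your real payload size before sizing.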

Solution 2:[2]

To calculate how many partitions you need, I suggest using Little's law. I am assuming that the 50k events can be processed independently and that no locks are involved.

You need to know the average time it takes to successfully process an event.

For example, suppose processing an event takes 10 ms; a single thread working sequentially can then process 100 events/sec.

To handle 50K events per second, you would need to process 50000 / 100 = 500 events concurrently.

Applying this, you can have 500 partitions in your event hub (the v5 Java SDK for Event Hubs will give you a thread per partition).

I've worked on a similar use case where we needed to support a maximum of 10K events/sec and write them to a database, and the average time to process a single event was ~8 ms.

So in one thread we could sequentially process ~125 events/sec, and we have 80 (10000 / 125) partitions in that event hub. Our throughput could then reach a maximum of ~10k events/sec (the numbers might not match exactly, but they will be close).
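The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the Little's law estimate (required concurrency = arrival rate × average processing time), assuming one sequential consumer thread per partition. The function name is made up for this example.

```python
import math


def partitions_needed(target_events_per_sec: float, avg_processing_ms: float) -> int:
    """Little's law: concurrency = arrival rate * average time spent per event.

    Assumes events are independent and each partition is drained by a single
    sequential thread, as described in this answer.
    """
    avg_processing_sec = avg_processing_ms / 1000
    return math.ceil(target_events_per_sec * avg_processing_sec)


if __name__ == "__main__":
    # The two cases worked through above.
    print(partitions_needed(50_000, 10))  # 10 ms per event at 50K events/sec
    print(partitions_needed(10_000, 8))   # 8 ms per event at 10K events/sec
```

The estimate is only as good as the measured processing time, so it helps to take that average under realistic load (including the database write) before fixing the partition count.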

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Serkant Karaca
Solution 2