In Kafka, why don't we have parallelism while consuming a partition?
Here is my understanding about consuming information from a topic in Kafka.
A consumer group is responsible for reading information from a single topic. If a topic has five partitions and there are five consumers in the consumer group, each one reads from one whole partition. If I add another consumer to the consumer group, the newly added consumer will sit idle.
Instead of leaving the new consumer idle, why doesn't Kafka allow it to consume from a partition that is already being consumed by a different consumer? If that were possible, there would be more parallelism.
So, in gist: within one consumer group, why doesn't Kafka allow more than one consumer to read from a single partition?
Thanks!
Solution 1:[1]
In many use cases, Kafka partitions are also used to provide ordering on specific keys. For example, while processing events for a set of users, we might want to process events in parallel overall but in order for any single user (a user update event can be processed only after that user's creation event). In such scenarios, we would use the user ID as the partition key so that all events for that user go to one specific partition and can therefore be processed in order. If two consumers in the same group read from that partition at the same time, this per-key ordering could no longer be guaranteed.
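To make the key-to-partition idea concrete, here is a minimal sketch in plain Python with no Kafka client involved. The names `partition_for`, `NUM_PARTITIONS`, and the sample events are made up for illustration, and `zlib.crc32` is only a stand-in: Kafka's default partitioner uses a different hash for keyed records, but the principle (same key, same partition) is the same.

```python
import zlib

NUM_PARTITIONS = 5  # matches the five-partition topic in the question


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a record key to a partition index.

    crc32 is a stand-in hash for this sketch, not what Kafka actually uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Hypothetical events as (user_id, event_type) pairs, in the order produced.
events = [
    ("user-1", "created"),
    ("user-2", "created"),
    ("user-1", "updated"),
    ("user-2", "updated"),
    ("user-1", "deleted"),
]

# Append each event to the log of the partition its key hashes to.
partitions = {p: [] for p in range(NUM_PARTITIONS)}
for user_id, event_type in events:
    partitions[partition_for(user_id)].append((user_id, event_type))

# Each partition log keeps the events of any given user in produced order.
for p, log in partitions.items():
    print(p, log)
```

Because every event for a user lands in the same partition log, a single consumer reading that log from start to end sees the user's events in the order they were produced.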
If this is not your use case and ordering does not matter, you can always read events in batches in your consumer and process each batch in parallel.
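A rough sketch of that batch-and-fan-out approach is below, assuming ordering really does not matter. The `batch` list stands in for whatever records a single poll returns, and `handle` and `process_batch` are hypothetical names; no real Kafka client is used here.

```python
from concurrent.futures import ThreadPoolExecutor


def handle(record):
    """Hypothetical per-record work; replace with real processing."""
    key, value = record
    return f"{key}:{value.upper()}"


def process_batch(batch, max_workers=4):
    """Process one polled batch concurrently on a thread pool.

    Records may be handled in any order across workers, so this only suits
    workloads that do not depend on per-key ordering. The returned list is
    still in input order because Executor.map preserves it.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(handle, batch))


# Stand-in for the records returned by a single poll of one partition.
batch = [("user-1", "created"), ("user-2", "created"), ("user-3", "created")]
print(process_batch(batch))
```

In a real consumer you would typically commit offsets only after the whole batch has finished, so that a crash in the middle of a batch causes the records to be re-read rather than skipped.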
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
