Parallelization factor: AWS Kinesis data streams to Lambda

I'm very confused by the concept of ParallelizationFactor.


My understanding

https://stackoverflow.com/a/57534322/13000229
In the past, one KDS shard could send data to only one Lambda instance/invocation at a time. Multiple Lambda instances could not consume data from the same KDS shard concurrently.

https://aws.amazon.com/blogs/compute/new-aws-lambda-scaling-controls-for-kinesis-and-dynamodb-event-sources/
In November 2019, a new parameter, ParallelizationFactor (concurrent batches per shard), was introduced.

The default factor of one exhibits normal behavior. A factor of two allows up to 200 concurrent invocations on 100 Kinesis data shards.
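The arithmetic in that quote (shards times factor) can be sketched as below. This is a minimal illustration, not an official formula: the function names `max_concurrent_invocations` and `mapping_params`, the stream ARN, and the function name are hypothetical. The dictionary keys are the parameter names that boto3's `create_event_source_mapping` accepts, and the 1 to 10 range is the documented limit for ParallelizationFactor.

```python
def max_concurrent_invocations(shard_count: int, parallelization_factor: int = 1) -> int:
    """Upper bound on concurrent invocations: shards times concurrent batches per shard."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    # Lambda accepts ParallelizationFactor values from 1 to 10.
    if not 1 <= parallelization_factor <= 10:
        raise ValueError("parallelization_factor must be between 1 and 10")
    return shard_count * parallelization_factor


def mapping_params(stream_arn: str, function_name: str, factor: int) -> dict:
    """Keyword arguments shaped for boto3's lambda create_event_source_mapping call."""
    return {
        "EventSourceArn": stream_arn,      # hypothetical ARN supplied by the caller
        "FunctionName": function_name,     # hypothetical function name
        "StartingPosition": "LATEST",
        "BatchSize": 2,                    # assumed value, matching the example in the question
        "ParallelizationFactor": factor,
    }


print(max_concurrent_invocations(100))     # default factor of one
print(max_concurrent_invocations(100, 2))  # factor of two on 100 shards
```

The dictionary would be passed as `client.create_event_source_mapping(**params)` on a boto3 Lambda client; the call itself is left out so the sketch runs without AWS credentials.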


Questions

  1. By using ParallelizationFactor, can more than one Lambda instance get different data from the same KDS shard concurrently?
    For example, the shard has data d1, d2, d3, d4, d5 and d6, and we assume BatchSize = 2 and ParallelizationFactor = 2. Lambda instance A can consume d1 and d2, while Lambda instance B can consume d3 and d4 at the same time. Then, once Lambda instance A finishes the first batch, it starts processing d5 and d6, and so on.

(Figure: expected process flow)

  2. If the answer to Question 1 is yes, what might be sacrificed? (e.g. ordering within the same shard, or a record being processed more than once)

  3. If the answer to Question 1 is no, how will data in KDS shards be processed by Lambda concurrently?
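
To make the flow assumed in Question 1 concrete, here is a toy model of it. It encodes only the assumption in the question (consecutive batches handed to whichever instance is free), not confirmed Lambda behavior. The function name `simulate_expected_flow` and the lock-step rounds (every batch taking the same time) are simplifications made for this sketch.

```python
from collections import deque


def simulate_expected_flow(records, batch_size, parallelization_factor):
    """Return a list of rounds; each round maps an instance name to the batch it takes."""
    # Cut the shard's records into consecutive batches of batch_size.
    batches = deque(
        records[i:i + batch_size] for i in range(0, len(records), batch_size)
    )
    # Hypothetical instance names: A, B, C, ...
    instances = [chr(ord("A") + i) for i in range(parallelization_factor)]
    rounds = []
    while batches:
        current = {}
        for name in instances:
            if not batches:
                break
            # Assumption from the question: the next batch goes to the next free instance.
            current[name] = batches.popleft()
        rounds.append(current)
    return rounds


flow = simulate_expected_flow(["d1", "d2", "d3", "d4", "d5", "d6"], 2, 2)
for number, assignment in enumerate(flow, start=1):
    print(number, assignment)
```

Under these assumptions, the first round gives d1 and d2 to instance A and d3 and d4 to instance B, and the second round gives d5 and d6 to instance A, which is the flow the questions above ask about.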



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
