Understanding Kafka consumer poll behaviour

I have a consumer that polls records from a Kafka topic, and I am doing the following:

  1. Assign the consumer to a specific partition in a Kafka topic.
  2. Seek to a specific offset in the past (so there are definitely records to poll).
  3. Execute the following code:
while (true) {
    ConsumerRecords<GenericRecord, GenericRecord> items =
        consumer.poll(Duration.ofMillis(300));
    log.info("Polled {} items", items.count());
}

I get the following log:

Polled 0 items
Polled 0 items
Polled 0 items
Polled 0 items
.
.
.
Polled 0 items
Polled 3620 items

I would like to understand how poll behaves here: why did it return 0 records for many attempts and only return records at a later point in time? (Remember, I had already called seek to move to an offset in the past.)

This is my consumer configuration:

{
  schema.registry.url=https://schema-registry.*********,
  enable.auto.commit=false,
  max.poll.records=65536,
  group.id=**************,
  fetch.max.wait.ms=5000,
  bootstrap.servers=********,
  fetch.min.bytes=1048576,
  fetch.max.bytes=1048576,
  auto.offset.reset=earliest
}
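
One thing that may contribute to the run of empty polls is the interaction between the fetch settings above and the short poll timeout. With fetch.min.bytes=1048576, the broker holds a fetch request until about 1 MB is available or fetch.max.wait.ms (5000 ms here) has elapsed, while each poll(Duration.ofMillis(300)) gives up after 300 ms if no fetch response has arrived. The sketch below only does that arithmetic; the class and method names are hypothetical, and it ignores network latency, metadata lookups and connection setup, so treat the result as a rough bound rather than a prediction.

```java
class EmptyPollEstimate {
    // Rough upper bound on the number of polls that can time out while a
    // single fetch request is still being held by the broker.
    // Assumption: polls run back to back and nothing else delays the response.
    static long maxEmptyPolls(long pollTimeoutMs, long fetchMaxWaitMs) {
        if (pollTimeoutMs <= 0) {
            throw new IllegalArgumentException("pollTimeoutMs must be positive");
        }
        return fetchMaxWaitMs / pollTimeoutMs;
    }

    public static void main(String[] args) {
        // Values from the question: poll timeout 300 ms, fetch.max.wait.ms=5000
        System.out.println(maxEmptyPolls(300, 5000)); // prints 16
    }
}
```

If this is what is happening, lowering fetch.min.bytes (the client default is 1) or fetch.max.wait.ms (the client default is 500) should make records show up in earlier polls.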


Solution 1:[1]

You should try printing what offsets are actually being consumed to verify the seek worked.

Since you are setting auto.offset.reset=earliest, you shouldn't need to seek anywhere, unless you are re-using a group.id on each run and committing offsets somewhere in that loop.
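
To check which offsets are actually being returned, something along these lines could be called inside the poll loop. It is a minimal sketch that assumes the kafka-clients library is on the classpath; OffsetSummary and describe are hypothetical names, not part of the Kafka API.

```java
import java.util.List;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.common.TopicPartition;

class OffsetSummary {
    // Builds one line per partition with the first offset, last offset and
    // record count of the batch returned by a single poll.
    static <K, V> String describe(ConsumerRecords<K, V> records) {
        StringBuilder sb = new StringBuilder();
        for (TopicPartition tp : records.partitions()) {
            List<ConsumerRecord<K, V>> batch = records.records(tp);
            if (batch.isEmpty()) {
                continue;
            }
            long first = batch.get(0).offset();
            long last = batch.get(batch.size() - 1).offset();
            sb.append(tp).append(": offsets ").append(first).append("..").append(last)
              .append(" (").append(batch.size()).append(" records)\n");
        }
        return sb.toString();
    }
    // Possible use in the loop from the question:
    //   log.info("Polled {} items\n{}", items.count(), OffsetSummary.describe(items));
}
```

If the first offset logged matches the one passed to seek, the seek worked and the empty polls happened before any fetch response had arrived.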

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 OneCricketeer