Idempotent Camel AWS2-S3 consumer not retrieving all files from bucket when deleteAfterRead is false

I am using Camel AWS2-S3 (version 3.14.2) with an idempotent consumer backed by MemoryIdempotentRepository, and I am trying to read all files from an AWS S3 bucket. The retrieved files should stay in the bucket, so deleteAfterRead is set to false.

When debugging my code, I see that consecutive polls always retrieve a single file, and it is always the same one. After the second poll this file is correctly considered a duplicate and ignored; my DEBUG log shows "Ignoring duplicate message with id..." for it. However, although the bucket contains more files to retrieve, nothing else happens until the polling delay expires and the next poll starts. That poll again retrieves only the same file, which is already marked as a duplicate, and so on.

What is needed to make the S3 client keep retrieving the next file in the bucket after ignoring the previous one? All versions of Camel starting at 3.9.0 show the same behavior.

Has anyone run into the same problem and found a way to solve it?

Many thanks in advance.



Solution 1:[1]

The Apache Camel AWS2-S3 component should poll the files in an S3 bucket as needed when it is configured correctly.

If only one file is ever polled, one of the following likely applies:

  • You specified both bucketName and fileName when configuring your consumer, which resolves to polling the same file again and again
  • You set maxMessagesPerPoll to 1 without deleting the processed messages, which likewise resolves to polling the same file over and over
  • You set a prefix or delimiter configuration option that matches a single file

You should start by inspecting your endpoint configuration to see whether any of these misconfigurations apply.
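
To make the maxMessagesPerPoll case concrete, here is a minimal plain-Java model of a capped listing. It is only a sketch and not Camel's or the AWS SDK's actual code: the class CappedListingModel and the method poll are hypothetical names, and the model assumes that every poll lists keys in sorted order starting from the first key, with nothing deleted in between.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Simplified model of a capped bucket listing (hypothetical, not Camel code).
final class CappedListingModel {

    // Returns at most maxKeys keys in sorted order, always starting from the
    // first key (assumption: no marker is carried over between polls).
    static List<String> poll(TreeSet<String> bucket, int maxKeys) {
        List<String> page = new ArrayList<>();
        for (String key : bucket) {
            if (page.size() == maxKeys) {
                break;
            }
            page.add(key);
        }
        return page;
    }

    public static void main(String[] args) {
        TreeSet<String> bucket = new TreeSet<>(List.of("a.csv", "b.csv", "c.csv"));
        // Nothing is deleted between polls, so a cap of 1 keeps returning the first key.
        System.out.println(poll(bucket, 1));
        System.out.println(poll(bucket, 1));
        // A cap at least as large as the bucket returns every key.
        System.out.println(poll(bucket, 10));
    }
}
```

Under these assumptions, the files after the first one are never listed while the cap stays at 1, which matches the symptom described in the question.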

On the other hand, there are two ways to achieve idempotency:

  • Configure your consumer endpoint with moveAfterRead so that processed files are moved to a destination bucket and are not re-polled on the next iteration (you need to configure the destination bucket). Whether this works depends on your requirements: it is applicable only if you can afford to have the files in another bucket, meaning no other component or service relies on those files being in the source bucket
  • Configure an IdempotentConsumer backed by a repository of your choice, in which case you can mark (and save) processed messages
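
The idea behind the second option can be sketched in plain Java as follows. This is a simplified stand-in, not Camel's MemoryIdempotentRepository: the class SeenKeysFilter and the method acceptNew are hypothetical names, and using the object key as the message id is an assumption.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified stand-in for an idempotent repository (hypothetical, not Camel code).
final class SeenKeysFilter {

    // Keys that have already been processed.
    private final Set<String> seen = new HashSet<>();

    // Returns only the keys that were not processed before and remembers them.
    List<String> acceptNew(List<String> polledKeys) {
        List<String> fresh = new ArrayList<>();
        for (String key : polledKeys) {
            if (seen.add(key)) { // add() returns false for an already-seen key
                fresh.add(key);
            }
        }
        return fresh;
    }

    public static void main(String[] args) {
        SeenKeysFilter filter = new SeenKeysFilter();
        // First poll: both keys are new.
        System.out.println(filter.acceptNew(List.of("a.csv", "b.csv")));
        // Second poll: the files were not deleted, so only the new key passes.
        System.out.println(filter.acceptNew(List.of("a.csv", "b.csv", "c.csv")));
    }
}
```

Note that such a filter only skips duplicates among the keys a poll actually returns; it cannot make a poll list more files, so the endpoint options above still need to be checked.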

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 tmarwen