Why does a large "scroll" query fail with "circuit_breaking_exception" yet the "reindex" API succeeds?

I'm testing the Elasticsearch connector for Presto and running into an issue with queries that return a large number of documents: they fail with a circuit_breaking_exception. For example, a query that filters on a single field (accountId, stored as a number) and returns four high-cardinality fields fails this way. After reading the docs, my understanding is that this exception is raised when a node is at risk of running out of memory and ES preemptively stops the query to prevent an OOM error.

Under the hood, the ElasticSearchConnector uses scroll queries.

Recently, a colleague of mine reindexed this account using the reindex API. My understanding is that this API uses a combination of scroll queries and the bulk API to perform reindexing.
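To make the comparison concrete, here is a rough sketch of the two request shapes as I understand them. The index names (`events`, `events_v2`), the account ID, the returned field names, and the helper functions are all placeholders I made up for illustration; only the endpoint paths and the `size` / `source.size` batch-size parameters come from the Elasticsearch REST API. The code only builds the request bodies and does not contact a cluster.

```python
import json


def scroll_search_request(index, query, fields, page_size, keep_alive="1m"):
    """Build the first request of a scroll search.

    `size` is the number of hits returned per scroll page.
    """
    return {
        "method": "POST",
        "path": f"/{index}/_search?scroll={keep_alive}",
        "body": {"size": page_size, "query": query, "_source": fields},
    }


def reindex_request(source_index, dest_index, query, batch_size):
    """Build a _reindex request.

    `source.size` is the scroll batch size reindex uses internally
    before writing each batch to the destination index.
    """
    return {
        "method": "POST",
        "path": "/_reindex",
        "body": {
            "source": {"index": source_index, "size": batch_size, "query": query},
            "dest": {"index": dest_index},
        },
    }


# Hypothetical filter on the numeric accountId field.
account_filter = {"term": {"accountId": 12345}}

scroll = scroll_search_request(
    "events",
    account_filter,
    ["fieldA", "fieldB", "fieldC", "fieldD"],  # placeholder field names
    page_size=1000,
)
reindex = reindex_request("events", "events_v2", account_filter, batch_size=1000)

print(json.dumps(scroll, indent=2))
print(json.dumps(reindex, indent=2))
```

Written out this way, both requests page through the same filtered documents, and the per-page batch size is the one knob they visibly share, so comparing the values each client actually sends seems like a reasonable first check.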

If my current understanding is correct, I'm confused as to why the reindex API would succeed while a vanilla scroll query would fail.

I understand that I could tinker with the memory available to the ES nodes in conjunction with increasing the circuit breaker memory limit, but before I go throwing hardware at the problem I'd like to make sure I'm not missing some fundamental difference between these APIs that causes one to succeed and the other to fail.
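Before changing any limits, one thing I can check is which breaker is actually close to its limit. Below is a minimal sketch that summarizes the output of `GET _nodes/stats/breaker`. The sample response is made up (node ID and byte counts are illustrative, not from my cluster), and `breaker_usage` is a hypothetical helper; it assumes each breaker entry carries `limit_size_in_bytes`, `estimated_size_in_bytes`, and `tripped`.

```python
def breaker_usage(node_stats):
    """Return (node, breaker, used_fraction, tripped) rows, fullest first."""
    rows = []
    for node_id, node in node_stats["nodes"].items():
        for name, breaker in node["breakers"].items():
            limit = breaker["limit_size_in_bytes"]
            used = breaker["estimated_size_in_bytes"]
            # Guard against a zero or unset limit.
            fraction = used / limit if limit > 0 else 0.0
            rows.append((node_id, name, round(fraction, 3), breaker["tripped"]))
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Made-up sample in the assumed shape of the node stats response.
sample_stats = {
    "nodes": {
        "node-1": {
            "breakers": {
                "parent": {
                    "limit_size_in_bytes": 1000,
                    "estimated_size_in_bytes": 950,
                    "tripped": 3,
                },
                "request": {
                    "limit_size_in_bytes": 600,
                    "estimated_size_in_bytes": 60,
                    "tripped": 0,
                },
                "fielddata": {
                    "limit_size_in_bytes": 400,
                    "estimated_size_in_bytes": 100,
                    "tripped": 0,
                },
            }
        }
    }
}

usage = breaker_usage(sample_stats)
for node_id, name, fraction, tripped in usage:
    print(f"{node_id} {name}: {fraction:.1%} of limit, tripped {tripped} times")
```

Running this against real stats while the scroll query and the reindex job execute should show whether the two workloads push the same breaker toward its limit.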



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
