'ElasticSearch: Result window is too large
My friend stored 65000 documents on the Elastic Search cloud and I would like to retrieve all of them (using python). However, when I am running my current script, there is an error noticing that :
RequestError(400, 'search_phase_execution_exception', 'Result window is too large, from + size must be less than or equal to: [10000] but was [30000]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting.')
My script
es = Elasticsearch(cloud_id=cloud_id, http_auth=(username, password))
docs = es.search(body={"query": {"match_all": {}}, '_source': ["_id"], 'size': 65000})
What would be the easiest way to retrieve all those document and not limit it to 10000 docs? thanks
Solution 1:[1]
The limit has been set so that the result set does not overwhelm your nodes. Results will occupy memory in the elastic node. So bigger the result set, bigger the memory footprint and impact on the nodes.
Depending on what you want to do with the retrieved documents,
try to use the
scroll
api (as suggested in your error message) if its a batch job. Be mindful of the lifetime ofscroll context
in that case.or, use the
Search After
Solution 2:[2]
You should use the scroll API and get the results in different calls. The scroll API will return to you the results 10000 by 10000 as maximum (that will be available to consult during the amount of time you indicate in the call) and you will be able then to paginate the results and obtain them thanks to a scroll_id.
Solution 3:[3]
The error message itself is mentioning that how can you solve the issue, look carefully this part of the error message.
This limit can be set by changing the [index.max_result_window] index level setting.
Please refer update indices level setting on how to change that.
So for your setting it would look like:
PUT /<your-index-name>/_settings
{
"index" : {
"index.max_result_window" : 65000 -> note its equal to your all the docs in your index
}
}
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | sgalinma |
Solution 3 |