'Why is my elasticsearch update_by_query timing out?
I am using the Javascript library and and trying to perform a rather large update on documents. I have changed the timeout value on the query itself as well as the client object and am still getting a timeout that seems to be 2 minutes. Below is the query.
const arrayOfStrings = [...]; // This array could be between 10-12k elements in size
await elasticClient.updateByQuery({
index: "main-index",
refresh: true,
conflicts:"proceed",
script: {
lang: "painless",
source: "ctx._source.status = \"1\"; ctx._source.entities = []; for(term in params.array) ctx._source.entities.add(term);",
params: {
"array": arrayOfStrings
}
},
query: {
"term": {
"parent_id": 'dgd39gd3-db3dg23879-d893gdg38-e23ed' // There could be upwards of 5000 documents that match this criteria
}
},
timeout: "5m"
});
Also on the elastic client, I have requestTimeout set to '5m' as well. I can understand why this particular query may be timing out as its applying a 12k element array to a field on 5k documents. However I dont understand why the query is timing out after only 2 minutes when I have timeout values set that are longer.
Solution 1:[1]
What you need to do here is to run the update by query asynchronously
await elasticClient.updateByQuery({
index: "main-index",
refresh: true,
conflicts:"proceed",
waitForCompletion: false <---- add this setting
And then you can follow the progress of the task running asynchronously.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Val |
