ElasticSearch - Aggregation using a sample size
I have an ElasticSearch database with a large number of records. Each document has a numeric score and a label. I am aggregating the median and quartiles of the score field for a specific label. For example: what is the median score of all documents with the label foo?
Does anyone know if it's possible to base those percentile aggregations on a statistically significant sample of the data, instead of the full set, in order to speed up the aggregation query?
Thanks!
Solution 1:[1]
You can use a Term Query, or any other query, to filter the documents and then apply the aggregation to those results. That way the aggregation runs only on the documents matched by the query, not on every document in the index.
{
  "query": {
    "term": {
      "label": {
        "value": "foo"
      }
    }
  },
  "aggs": {
    "score_quartiles": {
      "percentiles": {
        "field": "score",
        "percents": [25, 50, 75]
      }
    }
  }
}
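On the sampling part of the question: one option that may help is the `sampler` aggregation, which limits sub-aggregations to at most `shard_size` documents per shard. The request below is only a sketch. The field names `label` and `score` are assumed from the question's description, the aggregation names `sample` and `score_quartiles` are made up for illustration, and the `shard_size` of 10000 is an arbitrary value you would need to tune.

```json
{
  "size": 0,
  "query": {
    "term": { "label": { "value": "foo" } }
  },
  "aggs": {
    "sample": {
      "sampler": { "shard_size": 10000 },
      "aggs": {
        "score_quartiles": {
          "percentiles": { "field": "score", "percents": [25, 50, 75] }
        }
      }
    }
  }
}
```

Note that `sampler` keeps the top-scoring documents on each shard rather than drawing a random sample, so whether the result is representative depends on how the matched documents are scored and ordered. It is worth comparing the sampled quartiles against a full run on your own data before relying on them.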
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | andrecoelho.rabbitbr |
