Elasticsearch - grouping by multiple fields
I thought this would be a simple operation in Elasticsearch, but grouping by multiple fields doesn't seem to be that trivial.
I'm looking for a way to query the latest document (based on the savedAt field) for each combination of the type, size and category fields.
Data:
POST data/_create/1
{
  "content": "some text ...",
  "type": "regular",
  "size": "medium",
  "category": "a1",
  "savedAt": "2022-01-02 15:09:27.527+0200"
}
POST data/_create/2
{
  "content": "some other text ...",
  "type": "regular",
  "size": "small",
  "category": "a1",
  "savedAt": "2022-01-02 16:09:27.527+0200"
}
POST data/_create/3
{
  "content": "some other text ...",
  "type": "regular",
  "size": "big",
  "category": "a1",
  "savedAt": "2022-01-02 19:09:27.527+0200"
}
POST data/_create/4
{
  "content": "some other different text ...",
  "type": "regular",
  "size": "big",
  "category": "a1",
  "savedAt": "2022-01-02 20:09:27.527+0200"
}
I expect the response to contain the documents with IDs 1, 2 and 4, one for each of these combinations:
- regular - medium - a1
- regular - small - a1
- regular - big - a1
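To make the expected result concrete, here is a minimal sketch in plain Python (not Elasticsearch) that applies the same "latest per combination" rule to the four sample documents. The function name `latest_per_combination` and the `docs` dictionary are made up for illustration; only the field values come from the data above.

```python
from datetime import datetime

# The four sample documents from the question, keyed by document ID.
docs = {
    1: {"type": "regular", "size": "medium", "category": "a1",
        "savedAt": "2022-01-02 15:09:27.527+0200"},
    2: {"type": "regular", "size": "small", "category": "a1",
        "savedAt": "2022-01-02 16:09:27.527+0200"},
    3: {"type": "regular", "size": "big", "category": "a1",
        "savedAt": "2022-01-02 19:09:27.527+0200"},
    4: {"type": "regular", "size": "big", "category": "a1",
        "savedAt": "2022-01-02 20:09:27.527+0200"},
}


def latest_per_combination(docs):
    """Map each (type, size, category) tuple to the ID of its newest document."""
    latest = {}
    for doc_id, doc in docs.items():
        key = (doc["type"], doc["size"], doc["category"])
        saved_at = datetime.strptime(doc["savedAt"], "%Y-%m-%d %H:%M:%S.%f%z")
        # Keep the document only if it is newer than the one stored for this key.
        if key not in latest or saved_at > latest[key][1]:
            latest[key] = (doc_id, saved_at)
    return {key: doc_id for key, (doc_id, _) in latest.items()}


result = latest_per_combination(docs)
print(sorted(result.values()))  # [1, 2, 4]
```

Documents 3 and 4 share the combination regular / big / a1, so only the newer one (ID 4) is kept.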
I can't use collapse because it doesn't support multiple fields.
I tried to use aggregations:
GET data/_search
{
  "size": 0,
  "aggs": {
    "agg1": {
      "terms": {
        "field": "type.keyword"
      },
      "aggs": {
        "agg2": {
          "terms": {
            "field": "size.keyword"
          },
          "aggs": {
            "agg3": {
              "terms": {
                "field": "category.keyword"
              }
            }
          }
        }
      }
    }
  }
}
But it doesn't return any buckets:
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "my-agg-name" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [ ]
    }
  }
}
Any suggestions?
UPDATE
This is the mapping used for this data:
PUT data
{
  "mappings": {
    "properties": {
      "savedAt": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss.SSSZ"
      },
      "type": {
        "type": "text",
        "analyzer": "keyword"
      },
      "size": {
        "type": "text",
        "analyzer": "keyword"
      },
      "category": {
        "type": "text",
        "analyzer": "keyword"
      }
    }
  }
}
Solution 1:[1]
You should index your documents using _doc instead of _create. After this change, your query will work.
POST data/_doc/1
{
  "content": "some text ...",
  "type": "regular",
  "size": "medium",
  "category": "a1",
  "savedAt": "2022-01-02 15:09:27.527+0200"
}
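The question also asks for the latest document per combination, which buckets alone do not return. One possible approach, shown here only as a sketch, is a composite aggregation over the three fields with a top_hits sub-aggregation sorted by savedAt. The Python below only builds the request body as a dictionary. The helper name `latest_per_group_body` and the aggregation names `groups` and `latest` are hypothetical. It also assumes the three fields are aggregatable under the plain names type, size and category (for example, mapped as keyword), which is not the case for the text mapping shown in the question's update.

```python
import json


def latest_per_group_body(group_fields, date_field, page_size=100):
    """Build a search body with one bucket per field combination and the newest hit in each.

    Assumption: every field in group_fields is aggregatable (e.g. mapped as keyword).
    """
    return {
        "size": 0,  # no top-level hits, only aggregation results
        "aggs": {
            "groups": {  # hypothetical aggregation name
                "composite": {
                    "size": page_size,
                    # One terms source per grouping field; the source name reuses the field name.
                    "sources": [
                        {field: {"terms": {"field": field}}} for field in group_fields
                    ],
                },
                "aggs": {
                    "latest": {  # hypothetical aggregation name
                        "top_hits": {
                            "size": 1,
                            "sort": [{date_field: {"order": "desc"}}],
                        }
                    }
                },
            }
        },
    }


body = latest_per_group_body(["type", "size", "category"], "savedAt")
print(json.dumps(body, indent=2))
```

The printed JSON could be sent as the body of GET data/_search. With the sample data, each bucket's `latest` hit would be expected to be the newest document for that combination, provided the field-name assumption above holds.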
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | andrecoelho.rabbitbr |
