'How to get documents with fields that contains array with more than one value?
I have the following index properties:
{
"mappings": {
"properties": {
"content": { "type": "text" },
"place": {
"type": "nested",
"properties": {
"home": {
"type": "text"
},
"work": {
"type": "text"
}
}
}
}
}
}
I want to get all the documents that contain more than one value of place field.
For example, if the index contains the following 2 documents:
{
"content": "First content",
"place": [
{
"home": "home_1",
"work": "work_1"
},
{
"home": "home_2",
"work": "home_2"
}
]
},
{
"content": "Second content",
"place": [
{
"home": "home_3",
"work": "work_3"
}
]
}
I want to write a query that gives us the first document (because it contains place field with an array of size 2).
I have tried:
"filter": {
"script": {
"script": "doc['place'].values.length > 1"
}
}
But I got error:
RequestError(400, 'parsing_exception', 'Unknown key for a START_OBJECT in [filter].')
Solution 1:[1]
Option 1: Using function_score
You can add exits query and check if home field is available in each element of place. If it is available, then it will score with 1 and you can use score_mode to sum so if there are 2 elements, then the sum will be 2 and if there are 1 element then sum will be 1. Last you will set min_score to 2 so all documents which have scored 2 will come in response.
POST counttest/_search
{
"query": {
"function_score": {
"query": {
"nested": {
"path": "place",
"query": {
"exists": {
"field": "place.home"
}
},
"score_mode": "sum"
}
},
"functions": [
{
"script_score": {
"script": {
"source": "_score > 1 ? 2 : 0"
}
}
}
],
"boost_mode": "replace"
}
},
"min_score": 2
}
Option 2: Using Script Field
This will add new fields with value true or false and application side you can make the decision to show documents or not.
POST counttest/_search
{
"_source": "*",
"script_fields": {
"new_place": {
"script": {
"lang": "painless",
"source": "params['_source']['place'].size() > 1"
}
}
}
}
Sample Response:
"hits" : [
{
"_index" : "counttest",
"_type" : "_doc",
"_id" : "ZXKc1oAB4onl0QH9hzaW",
"_score" : 1.0,
"_source" : {
"place" : [
{
"work" : "work_1",
"home" : "home_1"
},
{
"work" : "home_2",
"home" : "home_2"
}
],
"content" : "First content"
},
"fields" : {
"new_place" : [
true
]
}
},
{
"_index" : "counttest",
"_type" : "_doc",
"_id" : "ZnKv1oAB4onl0QH9HjaD",
"_score" : 1.0,
"_source" : {
"place" : [
{
"work" : "work_3",
"home" : "home_3"
}
],
"content" : "Second content"
},
"fields" : {
"new_place" : [
false
]
}
}
]
Option 3: Store count at index time (Best Solution)
You can store a number of elements available in place field while indexing document itself. Once you store count with the document, it will be very easy to filter a document and it will not add load to response time.
Sample Document
{
"content": "First content",
"place_count" : 2,
"place": [
{
"home": "home_1",
"work": "work_1"
},
{
"home": "home_2",
"work": "home_2"
}
]
},
{
"content": "Second content",
"place_count" : 1,
"place": [
{
"home": "home_3",
"work": "work_3"
}
]
}
Query:
POST counttest/_search
{
"query": {
"range": {
"place_count": {
"gte": 2
}
}
}
}
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Sagar Patel |
