'How to get documents with fields that contains array with more than one value?

I have the following index properties:

{
  "mappings": {
    "properties": {
      "content": { "type": "text" },
      "place": {
        "type": "nested",
        "properties": {
          "home": {
            "type": "text"
          },
          "work": {
            "type": "text"
          }
        }
      }
    }
  }
}

I want to get all the documents that contain more than one value of place field.

For example, if the index contains the following 2 documents:

{
    "content": "First content",
    "place": [
        {
            "home": "home_1",
            "work": "work_1"
        },
        {
            "home": "home_2",
            "work": "home_2"
        }
    ]
},

{
    "content": "Second content",
    "place": [
        {
            "home": "home_3",
            "work": "work_3"
        }
    ]
}

I want to write a query that gives us the first document (because it contains place field with an array of size 2).

I have tried:

"filter": {
            "script": {
                "script": "doc['place'].values.length > 1"
            }
        }

But I got error:

RequestError(400, 'parsing_exception', 'Unknown key for a START_OBJECT in [filter].')


Solution 1:[1]

Option 1: Using function_score

You can add exits query and check if home field is available in each element of place. If it is available, then it will score with 1 and you can use score_mode to sum so if there are 2 elements, then the sum will be 2 and if there are 1 element then sum will be 1. Last you will set min_score to 2 so all documents which have scored 2 will come in response.

POST counttest/_search
{
  "query": {
    "function_score": {
      "query": {
        "nested": {
          "path": "place",
          "query": {
            "exists": {
              "field": "place.home"
            }
          },
          "score_mode": "sum"
        }
      },
      "functions": [
        {
          "script_score": {
            "script": {
              "source": "_score > 1 ? 2 : 0"
            }
          }
        }
      ],
      "boost_mode": "replace"
    }
  },
  "min_score": 2
}

Option 2: Using Script Field

This will add new fields with value true or false and application side you can make the decision to show documents or not.

POST counttest/_search
{
  "_source": "*", 
  "script_fields": {
    "new_place": {
      "script": {
        "lang": "painless",
        "source": "params['_source']['place'].size() > 1"
      }
    }
  }
}

Sample Response:

"hits" : [
      {
        "_index" : "counttest",
        "_type" : "_doc",
        "_id" : "ZXKc1oAB4onl0QH9hzaW",
        "_score" : 1.0,
        "_source" : {
          "place" : [
            {
              "work" : "work_1",
              "home" : "home_1"
            },
            {
              "work" : "home_2",
              "home" : "home_2"
            }
          ],
          "content" : "First content"
        },
        "fields" : {
          "new_place" : [
            true
          ]
        }
      },
      {
        "_index" : "counttest",
        "_type" : "_doc",
        "_id" : "ZnKv1oAB4onl0QH9HjaD",
        "_score" : 1.0,
        "_source" : {
          "place" : [
            {
              "work" : "work_3",
              "home" : "home_3"
            }
          ],
          "content" : "Second content"
        },
        "fields" : {
          "new_place" : [
            false
          ]
        }
      }
    ]

Option 3: Store count at index time (Best Solution)

You can store a number of elements available in place field while indexing document itself. Once you store count with the document, it will be very easy to filter a document and it will not add load to response time.

Sample Document

{
    "content": "First content",
    "place_count" : 2,
    "place": [
        {
            "home": "home_1",
            "work": "work_1"
        },
        {
            "home": "home_2",
            "work": "home_2"
        }
    ]
},

{
    "content": "Second content",
    "place_count" : 1,
    "place": [
        {
            "home": "home_3",
            "work": "work_3"
        }
    ]
}

Query:

POST counttest/_search
{
  "query": {
    "range": {
      "place_count": {
        "gte": 2
      }
    }
  }
}

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Sagar Patel