Elasticsearch - must_not is not affecting score on nested fields

I'm trying to query some data ordered by best score, based on the following rules:

  1. approvals should be action=approved or type=a1 or type=a2
  2. approvals should not match any action=rejected

I'm not filtering any data out of the query; I'm just trying to get the best matches first.

Mapping:

PUT test
{
  "mappings": {
    "properties": {
      "savedAt": {
        "type":   "date",
        "format": "yyyy-MM-dd'T'HH:mm:ss.SSSZ"
      },
      "approvals": {
        "type": "nested",
        "properties": {
          "action": {
            "type":   "text",
            "analyzer": "keyword"
          },
          "by": {
            "type":   "text",
            "analyzer": "keyword"
          },
          "type": {
            "type":   "text",
            "analyzer": "keyword"
          }
        }
      }
    }
  }
}

Data:

POST test/_create/1
{
   "savedBy": "Donatello",
   "savedAt": "2022-04-18T19:09:27.527+0200",
   "approvals": [
      {
         "action": "approved",
         "type": "a1",
         "by": "Raphael"
      },
      {
         "action": "approved",
         "type": "a2",
         "by": "Michelangelo"
      }
   ]
}

POST test/_create/2
{
   "savedBy": "Michelangelo",
   "savedAt": "2022-04-19T19:09:27.527+0200",
   "approvals": [
      {
         "action": "approved",
         "type": "a1",
         "by": "Raphael"
      },
      {
         "action": "rejected",
         "type": "a2",
         "by": "Leonardo"
      }
   ]
}

POST test/_create/3
{
   "savedBy": "Raphael",
   "savedAt": "2022-04-20T19:09:27.527+0200",
   "approvals": [
      {
         "action": "approved",
         "type": "a1",
         "by": "Leonardo"
      }
   ]
}

Query:

GET test/_search
{
  "sort" : [
    "_score",
    { "savedAt" : "desc" }
  ],
  "query": {
    "bool": {
      "should": [
        {
          "nested": {
            "path": "approvals",
            "query": {
              "bool": {
                "must_not": [
                   {
                      "term": {
                        "approvals.action": {
                          "value": "rejected"
                        }
                      }
                    }
                ], 
                "should": [
                    {
                      "term": {
                        "approvals.action": {
                          "value": "approved"
                        }
                      }
                    },
                    {
                      "term": {
                        "approvals.type": {
                          "value": "a1"
                        }
                      }
                    },
                    {
                      "term": {
                        "approvals.type": {
                          "value": "a2"
                        }
                      }
                    }
                ],
                "minimum_should_match": 2
              }
            }
          }
        }
      ],
      "minimum_should_match": 0
    }
  }
}

Response:

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.99491465,
        "_source" : {
          "savedBy" : "Donatello",
          "savedAt" : "2022-04-18T19:09:27.527+0200",
          "approvals" : [
            {
              "action" : "approved",
              "type" : "a1",
              "by" : "Raphael"
            },
            {
              "action" : "approved",
              "type" : "a2",
              "by" : "Michelangelo"
            }
          ]
        },
        "sort" : [
          0.99491465,
          1650301767527
        ]
      },
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 0.8266785,
        "_source" : {
          "savedBy" : "Raphael",
          "savedAt" : "2022-04-20T19:09:27.527+0200",
          "approvals" : [
            {
              "action" : "approved",
              "type" : "a1",
              "by" : "Leonardo"
            }
          ]
        },
        "sort" : [
          0.8266785,
          1650474567527
        ]
      },
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.8266785,
        "_source" : {
          "savedBy" : "Michelangelo",
          "savedAt" : "2022-04-19T19:09:27.527+0200",
          "approvals" : [
            {
              "action" : "approved",
              "type" : "a1",
              "by" : "Raphael"
            },
            {
              "action" : "rejected",
              "type" : "a2",
              "by" : "Leonardo"
            }
          ]
        },
        "sort" : [
          0.8266785,
          1650388167527
        ]
      }
    ]
  }
}

You can see that documents id=2 and id=3 have the same score ("_score" : 0.8266785).

I was expecting id=2 to have the lowest score, since it contains action=rejected (listed in the must_not criteria).

Could someone explain how Elasticsearch is scoring in this case, please?



Solution 1:[1]

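must_not does not contribute to scoring. Inside the nested query it only excludes the nested objects that match it: in id=2 the rejected approval is dropped, while the other approval (action=approved, type=a1) still matches and is scored exactly like the single approval in id=3. That is why both documents end up with the same score.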

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html

must_not

The clause (query) must not appear in the matching documents. Clauses are executed in filter context meaning that scoring is ignored and clauses are considered for caching. Because scoring is ignored, a score of 0 for all documents is returned.
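
If the goal is to rank documents that contain a rejected approval lower without filtering them out, one option is the boosting query, which multiplies the score of any document matching its negative clause by negative_boost. The request below is a minimal sketch against the index and data from the question, not a tested drop-in replacement. The value 0.5 for negative_boost is an arbitrary assumption and would need tuning, and the must_not clause has been removed from the inner bool query because the penalty is now applied at the document level.

```
# Sketch only: negative_boost of 0.5 is an assumed value, tune as needed
GET test/_search
{
  "sort": [
    "_score",
    { "savedAt": "desc" }
  ],
  "query": {
    "boosting": {
      "positive": {
        "bool": {
          "should": [
            {
              "nested": {
                "path": "approvals",
                "query": {
                  "bool": {
                    "should": [
                      { "term": { "approvals.action": { "value": "approved" } } },
                      { "term": { "approvals.type": { "value": "a1" } } },
                      { "term": { "approvals.type": { "value": "a2" } } }
                    ],
                    "minimum_should_match": 2
                  }
                }
              }
            }
          ],
          "minimum_should_match": 0
        }
      },
      "negative": {
        "nested": {
          "path": "approvals",
          "query": {
            "term": { "approvals.action": { "value": "rejected" } }
          }
        }
      },
      "negative_boost": 0.5
    }
  }
}
```

With this shape, id=2 still appears in the results, but its score is scaled down because one of its approvals matches the negative clause, so it should sort below id=3. Adding "explain": true to the request body shows how each clause contributes to the final score.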

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 ThoughtfulHacking