Query Doesn't Match Numbers In Text

Match queries can find strings that contain numbers, but in my case I am trying to search for phone numbers by a partial value and get no hits. The mapping and analyzers are provided below. For example, I have a document indexed as follows:

{
  "userId": 126817,
  "name": "Test User",
  "phoneNumber": "5551112233"
}

When I use a match query, it doesn't match anything:

{"match" : {"phoneNumber": {"query": "555"}}}

When I use a prefix query, it does match:

{"prefix" : {"phoneNumber": {"value": "555"}}}

Analyze Results

{
    "tokens": [
        {
            "token": "5551112233",
            "start_offset": 0,
            "end_offset": 10,
            "type": "<NUM>",
            "position": 0
        }
    ]
}

Mapping

{
      index: "user-clinics",
      type: "user-clinic",
      body: {properties: {id: {type: "long"}} }
}

Analyzers

const TurkishAnalyzer = {
  analysis: {
    filter: {
      my_ascii_folding: {
        type: "asciifolding",
        preserve_original: true
      }
    },
    analyzer: {
      turkish_analyzer: {
        tokenizer: "standard",
        filter: ["lowercase", "my_ascii_folding"]
      }
    }
  }
};

const AutoCompleteAnalyzer = {
  analysis: {
    filter: {
      autocomplete_filter: {
        type: "edge_ngram",
        min_gram: 1,
        max_gram: 20
      }
    },
    analyzer: {
      autocomplete_search: {
        type: "custom",
        tokenizer: "standard",
        filter: ["lowercase"]
      },
      autocomplete_index: {
        type: "custom",
        tokenizer: "standard",
        filter: ["lowercase", "autocomplete_filter"]
      }
    }
  }
};


Solution 1:[1]

It's because edge_ngram generates grams only from the beginning of each token, so only prefixes are indexed. For the token asd123, for example, that means a, as, asd, asd1, asd12, asd123.
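
To make the prefix-only behaviour concrete, here is a minimal sketch in plain JavaScript that imitates what an edge_ngram filter emits for a single token. The function name edgeNgrams is hypothetical and this is not Elasticsearch code; it only mirrors the min_gram and max_gram settings from the question.

```javascript
// Imitates an edge_ngram token filter for one token: it emits only the
// prefixes whose length lies between minGram and maxGram.
// edgeNgrams is a hypothetical helper, not part of any Elasticsearch client.
function edgeNgrams(token, minGram, maxGram) {
  const grams = [];
  const longest = Math.min(maxGram, token.length);
  for (let len = minGram; len <= longest; len++) {
    grams.push(token.slice(0, len));
  }
  return grams;
}

console.log(edgeNgrams("asd123", 1, 20));
// [ 'a', 'as', 'asd', 'asd1', 'asd12', 'asd123' ]

const phonePrefixes = edgeNgrams("5551112233", 1, 20);
console.log(phonePrefixes.includes("555")); // true: a prefix is indexed
console.log(phonePrefixes.includes("111")); // false: an inner substring is not
```

Note that this only helps for fields that are actually analyzed with autocomplete_index; a field analyzed without the filter keeps a single token, as in the analyze output shown in the question.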

You need to change your autocomplete_filter to ngram if you also want to be able to match inside tokens, e.g. d12 or 123.

Beware, though, that this might generate a lot more tokens.
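
As a rough illustration of that cost, the sketch below imitates an ngram filter in plain JavaScript and counts what it would emit for the phone number from the question. The helper ngrams and the filter object ngramAutocompleteFilter are hypothetical, and the gram sizes in the filter are illustrative assumptions rather than values from the answer. Also note that recent Elasticsearch versions limit the gap between max_gram and min_gram through the index.max_ngram_diff index setting, so a wide range such as 1 to 20 may need that setting raised.

```javascript
// Imitates an ngram token filter for one token: it emits every substring
// whose length lies between minGram and maxGram (duplicates included).
// ngrams is a hypothetical helper, not part of any Elasticsearch client.
function ngrams(token, minGram, maxGram) {
  const result = [];
  for (let start = 0; start < token.length; start++) {
    for (let len = minGram; len <= maxGram && start + len <= token.length; len++) {
      result.push(token.slice(start, start + len));
    }
  }
  return result;
}

const phone = "5551112233";
const allGrams = ngrams(phone, 1, 20);
console.log(allGrams.includes("111")); // true: inner substrings are now indexed
console.log(allGrams.length);          // 55, versus 10 prefixes for this token

// Hypothetical replacement for autocomplete_filter in the analysis settings.
// The gram sizes are assumptions chosen to keep the token count small.
const ngramAutocompleteFilter = {
  autocomplete_filter: {
    type: "ngram",
    min_gram: 2,
    max_gram: 3
  }
};
```

With a narrow range like the one in ngramAutocompleteFilter, a query term only matches if its length falls inside that range, so the sizes are a trade-off between index size and which partial inputs can match.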

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Val