Avoid part of a string search in Elasticsearch
I have a scenario where I want to search for 'bank of india', but the documents retrieved also include hits for 'reserve bank of india', 'state bank of india', etc. Basically, the named entity in the search string is also part of other named entities.
What are the ways to avoid this in Elasticsearch?
Solution 1:[1]
If you use the `keyword` type instead of `text` as the mapping for your entity field, you will no longer get those partial matches. `keyword` says to treat the text as a single unit (named entities are like this), while `text` says to treat each word as a unit and consider the field a bag of words, so the query looks for the most word matches, regardless of order or whether all of the words are present. There are different queries that can require order (`match_phrase`) or require all words to be matched (the `minimum_should_match` parameter), although neither of those rules out a longer entity that contains the whole search string. That is why I like to use the `term` query if you follow the `keyword` mapping strategy. Does that make sense?
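To make the difference concrete, here is a minimal sketch in Python. The index name `entities` and field name `entity_name` are hypothetical, and the mapping body assumes the typeless mapping format of Elasticsearch 7 or later. The two helper functions are only a toy imitation of how a default `match` on a `text` field and a `term` query on a `keyword` field behave; they are not how Elasticsearch is implemented.

```python
import json

# Hypothetical names, chosen for illustration only.
INDEX = "entities"
FIELD = "entity_name"

# Mapping body: index the field as `keyword`, so the whole value is one term.
# Assumes the typeless mapping format (Elasticsearch 7 or later).
mapping = {"mappings": {"properties": {FIELD: {"type": "keyword"}}}}

# Query body: a `term` query compares against the stored value exactly.
term_query = {"query": {"term": {FIELD: {"value": "bank of india"}}}}

# Sample values taken from the question.
docs = ["bank of india", "reserve bank of india", "state bank of india"]


def text_match(query: str, value: str) -> bool:
    """Toy bag-of-words match: true if any query word occurs in the value."""
    return bool(set(query.split()) & set(value.split()))


def keyword_term(query: str, value: str) -> bool:
    """Toy exact match: true only if the whole value equals the query."""
    return query == value


text_hits = [d for d in docs if text_match("bank of india", d)]
keyword_hits = [d for d in docs if keyword_term("bank of india", d)]

print(json.dumps(mapping))
print(json.dumps(term_query))
print(text_hits)     # all three sample values share words with the query
print(keyword_hits)  # only the value that equals the query
```

The two dictionaries are the request bodies you would send when creating the index and when searching it. One thing to keep in mind with this approach: a `term` query on a `keyword` field is an exact comparison, so the stored value has to match the search string character for character, including case, unless a normalizer is configured on the field.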
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Nate |