Convert directives like "+" in a search string when using the query DSL?
When using the query string format (i.e. everything in the URL) you simply convert to "percent encoding":
"+bubbles +bananas" --> urllib.parse.quote(search_string.encode('utf8')) --> "%2Bbubbles%20%2Bbananas".
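For reference, the percent-encoding step can be reproduced on its own. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
from urllib.parse import quote, unquote

search_string = "+bubbles +bananas"

# By default quote() leaves only letters, digits and "_.-~/" unescaped,
# so "+" becomes %2B and the space becomes %20.
encoded = quote(search_string)
print(encoded)

# Percent-encoding is a transport detail: decoding restores the original text.
assert unquote(encoded) == search_string
```

The encoding exists only so the string survives being placed in a URL; the server decodes it before parsing the query.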
But when using the query DSL this doesn't seem to work:
data = {
    'match': {
        'field1': "%2Bbubbles%20%2Bbananas"
    }
}
... this produces results that include "20" as one of the search terms; in fact, they are the same results as for the search string "2bbubbles 20 2bbananas".
But... doing the following doesn't work either:
data = {
    'match': {
        'field1': "+bubbles +bananas"
    }
}
... produces the same result as you get if you put "bubbles bananas".
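One plausible explanation for both observations is that a match query passes its text through the field's analyzer rather than parsing operators, so punctuation such as "+" and "%" is simply discarded. The sketch below is only a crude approximation of that analysis (lowercase, then split on anything that is not a letter or digit). The function name is hypothetical, and the real standard analyzer follows Unicode segmentation rules that this regex does not capture.

```python
import re

def rough_standard_tokens(text):
    """Crude stand-in for the standard analyzer: lowercase, then keep runs of letters/digits."""
    return re.findall(r"[a-z0-9]+", text.lower())

# The percent-encoded string falls apart at each "%", leaving "20" as its own term.
print(rough_standard_tokens("%2Bbubbles%20%2Bbananas"))  # ['2bbubbles', '20', '2bbananas']

# The "+" signs are dropped, so this is equivalent to searching for "bubbles bananas".
print(rough_standard_tokens("+bubbles +bananas"))        # ['bubbles', 'bananas']
```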
Given a search string like "+bubbles +bananas", how do you convert this to query DSL?
My current guess is that it possibly equates to using filters or something similar. All well and good if so, but then how would one translate MINUS ("-") to query DSL? E.g. "+bubbles +bananas -fruitcake"?
Solution 1:[1]
By default, Elasticsearch analyzes text with the standard tokenizer, which drops characters such as "+". If you create a custom analyzer that uses the whitespace tokenizer instead, the "+" is kept as part of each token. Here are the results of the _analyze endpoint for the standard and whitespace tokenizers.
GET {index}/_analyze
{
  "tokenizer": "standard",
  "text": ["+bubbles +bananas"]
}
Result:
{
  "tokens" : [
    {
      "token" : "bubbles",
      "start_offset" : 1,
      "end_offset" : 8,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "bananas",
      "start_offset" : 10,
      "end_offset" : 17,
      "type" : "<ALPHANUM>",
      "position" : 1
    }
  ]
}
GET {index}/_analyze
{
  "tokenizer": "whitespace",
  "text": ["+bubbles +bananas"]
}
Result:
{
  "tokens" : [
    {
      "token" : "+bubbles",
      "start_offset" : 0,
      "end_offset" : 8,
      "type" : "word",
      "position" : 0
    },
    {
      "token" : "+bananas",
      "start_offset" : 9,
      "end_offset" : 17,
      "type" : "word",
      "position" : 1
    }
  ]
}
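As a rough sketch of what such a custom analyzer could look like, the index-creation body below is written as a Python dict to match the question's code. The analyzer name keep_symbols is hypothetical, the lowercase filter is an added assumption, and the typeless mappings layout assumes Elasticsearch 7 or later.

```python
import json

index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                # Hypothetical analyzer name; any identifier works.
                "keep_symbols": {
                    "type": "custom",
                    # Split on whitespace only, so "+bubbles" stays one token.
                    "tokenizer": "whitespace",
                    # Assumption: lowercase tokens so matching is case-insensitive.
                    "filter": ["lowercase"],
                }
            }
        }
    },
    "mappings": {
        "properties": {
            # Apply the custom analyzer to the field from the question.
            "field1": {"type": "text", "analyzer": "keep_symbols"}
        }
    },
}

# Print the JSON that would be sent as the body of an index-creation request.
print(json.dumps(index_body, indent=2))
```

Note that with this analyzer the "+" is indexed and searched as a literal part of the token; it does not make the term required.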
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | YD9 |
