Elasticsearch in Python, is there a phrase suggester?
Following this link here, there is a concept called "Phrase Suggester", which uses n-gram methods to give you suggestions, somewhat like autocompletion. I was trying to see how to use it through the API that Python offers (docs found here), but I could not find anything mentioning n-grams or a phrase suggester.
Does this method exist in the Python Elasticsearch API? I am aware of NLTK and the n-gram methods there.
Here is what I have.
First connecting, this block of code works fine
from elasticsearch import Elasticsearch
CLOUD_ID = 'My_deployment:...'
ELASTIC_PASSWORD = 'password'
es = Elasticsearch(cloud_id=CLOUD_ID,
                   basic_auth=("elastic", ELASTIC_PASSWORD))
This second block does not work
text = 'noble prize'
suggest_dictionary = {"simple_phrase" : {
    'text' : text,
    "phrase" : {
        "field" : "title.trigram"
    }
  }
}
query_dictionary = {'suggest' : suggest_dictionary}
res = es.search(
    index='test',
    body=query_dictionary)
print(res)
The error message is this
<ipython-input-29-05c434577314>:12: DeprecationWarning: The 'body' parameter is deprecated and will be removed in a future version. Instead use individual parameters.
res = es.search(
---------------------------------------------------------------------------
NotFoundError Traceback (most recent call last)
<ipython-input-29-05c434577314> in <module>
10 query_dictionary = {'suggest' : suggest_dictionary}
11
---> 12 res = es.search(
13 index='test',
14 body=query_dictionary)
~/anaconda3/lib/python3.8/site-packages/elasticsearch/_sync/client/utils.py in wrapped(*args, **kwargs)
402 pass
403
--> 404 return api(*args, **kwargs)
405
406 return wrapped # type: ignore[return-value]
~/anaconda3/lib/python3.8/site-packages/elasticsearch/_sync/client/__init__.py in search(self, index, aggregations, aggs, allow_no_indices, allow_partial_search_results, analyze_wildcard, analyzer, batched_reduce_size, ccs_minimize_roundtrips, collapse, default_operator, df, docvalue_fields, error_trace, expand_wildcards, explain, fields, filter_path, from_, highlight, human, ignore_throttled, ignore_unavailable, indices_boost, lenient, max_concurrent_shard_requests, min_compatible_shard_node, min_score, pit, post_filter, pre_filter_shard_size, preference, pretty, profile, q, query, request_cache, rescore, rest_total_hits_as_int, routing, runtime_mappings, script_fields, scroll, search_after, search_type, seq_no_primary_term, size, slice, sort, source, source_excludes, source_includes, stats, stored_fields, suggest, suggest_field, suggest_mode, suggest_size, suggest_text, terminate_after, timeout, track_scores, track_total_hits, typed_keys, version)
3697 if __body is not None:
3698 __headers["content-type"] = "application/json"
-> 3699 return self.perform_request( # type: ignore[return-value]
3700 "POST", __path, params=__query, headers=__headers, body=__body
3701 )
~/anaconda3/lib/python3.8/site-packages/elasticsearch/_sync/client/_base.py in perform_request(self, method, path, params, headers, body)
319 pass
320
--> 321 raise HTTP_EXCEPTIONS.get(meta.status, ApiError)(
322 message=message, meta=meta, body=resp_body
323 )
NotFoundError: NotFoundError(404, 'index_not_found_exception', 'no such index [test]', test, index_or_alias)
The answer provided says to use PUT test to set up the index. Where? How? I have no idea. I am not familiar with that syntax, and Python does not seem to recognize it either.
Update
I was able to get it to work finally, but I am confused by the output
{'took': 1, 'timed_out': False, '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0}, 'hits': {'total': {'value': 0, 'relation': 'eq'}, 'max_score': None, 'hits': []}, 'suggest': {'simple_phrase': [{'text': 'Hi, I need help', 'offset': 0, 'length': 15, 'options': []}]}}
Where is the recommendation, that is, the autocompletion that finishes the sentence?
Solution 1:[1]
You can first create the index with a custom analyzer and mapping, so that you do not need to depend on an external Python n-gram library. Elasticsearch generates the n-grams (word shingles) at index time and stores them in a subfield.
Index Mapping
PUT test
{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "analysis": {
        "analyzer": {
          "trigram": {
            "type": "custom",
            "tokenizer": "standard",
            "filter": ["lowercase", "shingle"]
          }
        },
        "filter": {
          "shingle": {
            "type": "shingle",
            "min_shingle_size": 2,
            "max_shingle_size": 3
          }
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "fields": {
          "trigram": {
            "type": "text",
            "analyzer": "trigram"
          }
        }
      }
    }
  }
}
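To get a feel for what the trigram analyzer above stores in title.trigram, here is a small pure-Python approximation. It is only a sketch: the function name shingles is made up for illustration, whitespace splitting stands in for the standard tokenizer, and it assumes the shingle filter's default of also emitting single words. The real analysis happens inside Elasticsearch and may differ in details such as token order.

```python
def shingles(text, min_size=2, max_size=3, output_unigrams=True):
    """Approximate the lowercase + shingle filters on a whitespace-split text.

    Hypothetical helper for illustration only; not part of any library.
    """
    tokens = text.lower().split()  # stand-in for the standard tokenizer
    result = []
    for start in range(len(tokens)):
        if output_unigrams:
            result.append(tokens[start])
        # Emit every word n-gram of the allowed sizes starting at this token.
        for size in range(min_size, max_size + 1):
            if start + size <= len(tokens):
                result.append(" ".join(tokens[start:start + size]))
    return result


print(shingles("Nobel prize winner"))
```

The phrase suggester uses these word n-grams from the indexed documents to judge which phrases are likely, which is why the suggester's field points at title.trigram rather than title.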
You can then use the Python code below for the phrase suggester. The request is the same as the one shown in the documentation.
from elasticsearch import Elasticsearch
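# Assumes a local node; for Elastic Cloud pass cloud_id and basic_auth instead.
es = Elasticsearch("http://localhost:9200")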
text = 'noble prize'
suggest_dictionary = {"simple_phrase" : {
    'text' : text,
    "phrase" : {
        "field" : "title.trigram"
    }
  }
}
# Pass the suggester through the 'suggest' parameter; 'body' is deprecated.
res = es.search(
    index='test',
    suggest=suggest_dictionary)
print(res)
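To pull the suggested phrases out of the response, something like the following could work. It is a sketch that operates on a plain dict (with the 8.x client, res.body should give one); the helper name suggestion_texts and the second sample response are made up for illustration. The first sample is the output from the question's update.

```python
def suggestion_texts(response, name="simple_phrase"):
    """Collect the option texts of one named suggester from a search response dict."""
    texts = []
    for entry in response.get("suggest", {}).get(name, []):
        for option in entry.get("options", []):
            texts.append(option["text"])
    return texts


# Response from the question's update: the suggester ran but found nothing.
empty_response = {
    "hits": {"total": {"value": 0, "relation": "eq"}, "max_score": None, "hits": []},
    "suggest": {"simple_phrase": [
        {"text": "Hi, I need help", "offset": 0, "length": 15, "options": []}
    ]},
}

# Hypothetical response, only to show the shape when a suggestion exists.
sample_response = {
    "suggest": {"simple_phrase": [
        {"text": "noble prize", "offset": 0, "length": 11,
         "options": [{"text": "nobel prize", "score": 0.5}]}
    ]},
}

print(suggestion_texts(empty_response))
print(suggestion_texts(sample_response))
```

An empty options list means the suggester found no candidate phrase. In the output from the question the total hit count is 0, which suggests the index held no documents at that point, so there were no indexed terms to build suggestions from. Note also that, as far as I know, the phrase suggester proposes corrections for the given text rather than completing a sentence; for type-ahead completion the completion suggester is the closer fit.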
Update 1:
"The answer provided states to use the PUT test to setup the index. Where? no idea... how? no idea... I am not familiar with that syntax and Python does not seem to be able to recognize it either."
PUT test creates an index named test in Elasticsearch. If you have Kibana installed, you can go to the Dev Tools console and execute it there. Otherwise, you can send the same request with a curl command. If you have an existing index, you can use your own index name instead of test.
This will show how to use curl command for index creation.
This will show how to use python for creating index.
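As a rough sketch of the Python route, assuming the 8.x client (where indices.create accepts settings and mappings as keyword arguments), the PUT test request above could be translated like this. The function name create_suggest_index is made up for illustration, and es is expected to be an already connected Elasticsearch client.

```python
# Same analyzer and mapping as in the PUT test request above.
INDEX_SETTINGS = {
    "index": {
        "number_of_shards": 1,
        "analysis": {
            "analyzer": {
                "trigram": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "shingle"],
                }
            },
            "filter": {
                "shingle": {
                    "type": "shingle",
                    "min_shingle_size": 2,
                    "max_shingle_size": 3,
                }
            },
        },
    }
}

INDEX_MAPPINGS = {
    "properties": {
        "title": {
            "type": "text",
            "fields": {"trigram": {"type": "text", "analyzer": "trigram"}},
        }
    }
}


def create_suggest_index(es, index_name="test"):
    """Create the index unless it already exists; return True if it was created.

    Hypothetical helper; 'es' is assumed to be a connected Elasticsearch client.
    """
    if es.indices.exists(index=index_name):
        return False
    es.indices.create(
        index=index_name, settings=INDEX_SETTINGS, mappings=INDEX_MAPPINGS
    )
    return True
```

Calling create_suggest_index(es) with the client from the question should create the test index. The suggester still needs documents to draw on, so index a few titles afterwards, for example with es.index(index="test", document={"title": "nobel prize"}).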
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
