Multiple search on arrays in Elasticsearch, keeping the order of elements
I'm trying to run multiple searches over multiple arrays in Elasticsearch. I'm using NEST with C#.
Each array has the same length and is mapped to an Elasticsearch document. For example, each document contains 5 fields: h.1, h.2, h.3, h.4, h.5.
Each of these fields contains a number, so a document looks like this:
h.1: 2240 h.2: 7634 h.3: 1432 h.4: 75675 h.5: 3435
I need to search a list of arrays and get the top scores.
For example, I have a list of arrays to search like:
h.1: 2240 h.2: 6867 h.3: 1432 h.4: 453 h.5: 3435
h.1: 21423 h.2: 7634 h.3: 1432 h.4: 63445 h.5: 3435
h.1: 6457 h.2: 678 h.3: 546745 h.4: 423 h.5: 534534
h.1: 45346 h.2: 86789 h.3: 1432 h.4: 234523 h.5: 3435
h.1: 56457 h.2: 678 h.3: 634 h.4: 75675 h.5: 65476575
I need to find the document that is most similar to any of the arrays in the list, that is, the document with the highest number of values matching one of the arrays at the same index.
At the moment I'm searching for each element in the list like this:
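foreach (var hashes in list)
{
    // Build one "field.index:value" term per array position.
    var terms = hashes
        .Select((h, index) =>
        {
            // Escape the minus sign of negative values for the query string syntax.
            string stringValue = h >= 0 ? h.ToString() : "\\" + h;
            return $"{fieldName}.{index}:{stringValue}";
        });
    string queryString = string.Join(" ", terms);

    // One search request per array in the list.
    var response = _esClient.Search<MyDocument>(s => s
        .Explain()
        .Query(q => q
            // Inner lambda parameter renamed to qs so it does not clash with the outer q.
            .QueryString(qs => qs
                .DefaultField(f => f.Hashes.Values)
                .Query(queryString)))
        .Size(5));
}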
I then process the results on the client. However, this approach is very slow because of the many requests made to Elasticsearch.
I'd like to know if there is a way to do a multiple search in a single request, getting back only the documents that are most similar to the arrays in the list.
Any ideas?
I hope I have explained my goal clearly; I know it's a bit complex.
Solution 1:[1]
Sounds like a constant score query might give you what you want.
Here's an example (sorry, no C#) with two fields; you can add the others in the same way. The score should be the number of fields that match.
GET /_search
{
  "query": {
    "bool": {
      "should": [
        {
          "constant_score": {
            "filter": {
              "term": {
                "h.1": "2240"
              }
            },
            "boost": 1.0
          }
        },
        {
          "constant_score": {
            "filter": {
              "term": {
                "h.2": "6867"
              }
            },
            "boost": 1.0
          }
        }
      ]
    }
  }
}
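To cover the whole list of arrays in one request, one option might be to wrap one such bool query per array in a dis_max query. With tie_breaker at 0, dis_max scores a document by its best-matching sub-query, so the score should be the highest number of matching fields against any single array. The body below is only a sketch to send to GET /_search. It assumes the fields are indexed as exact values (numeric or keyword), it uses the first two arrays from the question, and it is cut down to three fields per array (h.4 and h.5 would be added the same way). The size of 5 is taken from the question's code.

```json
{
  "size": 5,
  "query": {
    "dis_max": {
      "tie_breaker": 0.0,
      "queries": [
        {
          "bool": {
            "should": [
              { "constant_score": { "filter": { "term": { "h.1": "2240" } }, "boost": 1.0 } },
              { "constant_score": { "filter": { "term": { "h.2": "6867" } }, "boost": 1.0 } },
              { "constant_score": { "filter": { "term": { "h.3": "1432" } }, "boost": 1.0 } }
            ]
          }
        },
        {
          "bool": {
            "should": [
              { "constant_score": { "filter": { "term": { "h.1": "21423" } }, "boost": 1.0 } },
              { "constant_score": { "filter": { "term": { "h.2": "7634" } }, "boost": 1.0 } },
              { "constant_score": { "filter": { "term": { "h.3": "1432" } }, "boost": 1.0 } }
            ]
          }
        }
      ]
    }
  }
}
```

With these three fields, the example document from the question matches two fields of each array, so under the assumptions above it should score 2. If you need separate top results for each array rather than one combined ranking, the Multi Search API (_msearch) accepts several search bodies in a single request.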
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | curiousity |
