'Azure Search Normalized Lowercase Field
I am unable to add a normalized copy of the "Title" field to our search index. Ultimately, I'm trying to use this field for case-insensitive order by. Currently, titles are returned in the following order (with $orderBy=TitleCaseInsensitive):
- Abc
- Bbc
- abc
And instead I want: Abc->abc->Bbc. I have forked the "Title" field out into two fields via a Field Mapping and am then applying a Custom Analyzer with the "lowercase" tokenFilter, to the normalized field. Can someone explain why I am not getting the desired results? Here is the relevant portion of the index definition:
"index":{
"name": "current-local-inventory",
"fields": [
{"name": "TitleCaseInsensitive","indexAnalyzer":"caseInsensitiveAnalyzer","searchAnalyzer":"keyword", "type": "Edm.String","filterable": false, "sortable": true, "facetable": false, "searchable": true},
{"name": "Title", "type": "Edm.String","filterable": true, "sortable": true, "facetable": false, "searchable": true},
],
"analyzers": [
{
"@odata.type":"#Microsoft.Azure.Search.CustomAnalyzer",
"name":"caseInsensitiveAnalyzer",
"charFilters":[],
"tokenizer":"keyword_v2",
"tokenFilters":["lowercase"]
}
]
},
"indexers":[{
"fieldMappings" : [
{"sourceFieldName" : "Title", "targetFieldName" : "Title" },
{"sourceFieldName" : "Title", "targetFieldName" : "TitleCaseInsensitive" }
]
}]
Solution 1:[1]
See my answer in the related post Azure Search - Accent insensitive analyzer not working when sorting. When you include the lowercase token filter it only affects search and not sorting. See Azure Search User Voice entry Case-insensitive sorting for string fields
My suggested workaround as I explain in the related post is to create a forked/shadow property. However, using an analyzer with a lowercase token filter won't help. The only way I could get your example working was to include a copy of your Title property that was already lowercased. Notice that I don't use fieldMapping and I don't use different analyzers for indexing and search like you have in your example.
CREATE INDEX
Create the index. Replace variables wrapped in angle brackets as suitable for your env.
{
"@odata.context": "https://{{SEARCH_SVC}}.{{DNS_SUFFIX}}/$metadata#indexes/$entity",
"@odata.etag": "\"0x8D8761DCBBCCD00\"",
"name": "{{INDEX_NAME}}",
"defaultScoringProfile": null,
"fields": [
{"name": "Id", "type": "Edm.String", "searchable": false, "filterable": true, "retrievable": true, "sortable": true, "facetable": false, "key": true, "indexAnalyzer": null, "searchAnalyzer": null, "analyzer": null, "synonymMaps": [] },
{"name": "TitleCaseInsensitive","indexAnalyzer": null, "searchAnalyzer": null, "analyzer": "caseInsensitiveAnalyzer", "type": "Edm.String","filterable": false, "sortable": true, "facetable": false, "searchable": true},
{"name": "Title", "type": "Edm.String","filterable": true, "sortable": true, "facetable": false, "searchable": true}
],
"scoringProfiles": [],
"corsOptions": null,
"suggesters": [],
"analyzers": [ {
"@odata.type":"#Microsoft.Azure.Search.CustomAnalyzer",
"name":"caseInsensitiveAnalyzer",
"charFilters":[],
"tokenizer":"keyword_v2",
"tokenFilters":["lowercase"]
}],
"tokenizers": [],
"tokenFilters": [],
"charFilters": [],
"encryptionKey": null
}
UPLOAD
Upload three sample documents.
{
"value": [
{
"@search.action": "mergeOrUpload",
"Id": "1",
"Title": "Abc",
"TitleCaseInsensitive": "abc"
},
{
"@search.action": "mergeOrUpload",
"Id": "2",
"Title": "abc",
"TitleCaseInsensitive": "abc"
},
{
"@search.action": "mergeOrUpload",
"Id": "3",
"Title": "Bbc",
"TitleCaseInsensitive": "bbc"
}
]
}
QUERY
Then, query with $orderby on your lowercased (normalized) property.
And you'll get the expected results where Title is sorted in a case-insensitive way.
{
"@odata.context": "https://<your-search-service>.search.windows.net/indexes('dg-test-65526118')/$metadata#docs(*)",
"@odata.count": 3,
"value": [
{
"@search.score": 1.0,
"Id": "2",
"TitleCaseInsensitive": "abc",
"Title": "abc"
},
{
"@search.score": 1.0,
"Id": "1",
"TitleCaseInsensitive": "abc",
"Title": "Abc"
},
{
"@search.score": 1.0,
"Id": "3",
"TitleCaseInsensitive": "bbc",
"Title": "Bbc"
}
]
}
I would love to be corrected with a simple way to accomplish this.
Solution 2:[2]
Please check out the Text normalization for case-insensitive filtering, faceting and sorting feature that's in Preview.
You can update your index to use this "normalizer" feature for the fields in which you'd like case-insensitive order-by operations.
You don't need a separate field TitleCaseInsensitive anymore. You can add "normalizer": "lowercase" to the Title field, and $orderBy=Title will sort in the order you'd like, ignoring casing.
The "lowercase" normalizer is pre-defined. If you'd like other filters to be applied, please look at predefined and custom normalizers
"index": {
"name": "current-local-inventory",
"fields": [
{"name": "Title", "type": "Edm.String", "filterable": true, "sortable": true, "facetable": false, "searchable": true, "normalizer":"lowercase"}
]
},
"indexers":[{
"fieldMappings" : [
{"sourceFieldName" : "Title", "targetFieldName" : "Title" }
]
}]
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Robert - MSFT |
| Solution 2 |
