Searching with String-style within Text fields in Solr

I have a field containing short texts (a few tokens). I index it as Text rather than String because I need to search within the text.

However, I also need String-style search (matching the entire field).

For example, suppose a field contains "Google Search Engine". Currently I can find the row by searching for "search engine". While preserving this behavior, I need another option that matches the row only if the search term is "google search engine".

I believe this is possible with a regex, but I expect that to be slow.

I wonder if there is a standard way to do so or if I need to add another field of the same content but with the String type.



Solution 1:[1]

Use multiple fields - the definition of the second field will differ based on whether you want the search to be case sensitive or not. If you're OK with having a case sensitive field (i.e. "Google" and "google" are different terms), then string is the correct choice.

If you want the field to be case insensitive, use a TextField with a KeywordTokenizer (which keeps the input as a single, large token) and a LowerCaseFilter attached (which lowercases the content).
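
As a rough sketch, the case-insensitive variant might look like this in the schema. The type name string_ci is made up for illustration; the tokenizer and filter factories are the standard Solr classes.

```xml
<!-- Hypothetical type name; the whole field value becomes one lowercased token -->
<fieldType name="string_ci" class="solr.TextField">
  <analyzer>
    <!-- Emits the entire input as a single token -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Lowercases that token so "Google" and "google" match -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Because only one analyzer is declared, the same chain runs at index time and at query time, so the query text is lowercased the same way as the indexed value.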

You can then search both fields by using qf (query fields) with the edismax/dismax query parsers and weight them differently. If you only need explicit searching (you choose yourself whether to match the whole string or just words in it), prefixing the query with the field name in the regular way would work.
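
One way to wire this up, shown only as a sketch, is to set edismax defaults on a request handler in solrconfig.xml. The field names title and title_exact and the boost of 10 are assumptions for illustration, not values from the question.

```xml
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <!-- title: tokenized field; title_exact: whole-string field with an arbitrary boost -->
    <str name="qf">title title_exact^10</str>
  </lst>
</requestHandler>
```

For explicit searching, a query such as title_exact:"google search engine" targets only the whole-string field. The quotes keep the query parser from splitting the value on whitespace before it reaches the field's analyzer.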

Use a copyField instruction to index the same content into both fields without changing your indexing pipeline. You'll need to reindex your core / collection for the new field to get any values.
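
A minimal sketch of the corresponding schema entries, assuming the tokenized field is called title and uses the stock text_general type, and the new field is called title_exact (both names are hypothetical). The type shown is string, for the case-sensitive variant; swap in a keyword-tokenized, lowercasing TextField type for the case-insensitive one.

```xml
<field name="title" type="text_general" indexed="true" stored="true"/>
<!-- Not stored: used only for matching; the original text is returned from "title" -->
<field name="title_exact" type="string" indexed="true" stored="false"/>
<!-- Copies the raw input value of title into title_exact at index time -->
<copyField source="title" dest="title_exact"/>
```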

And no, you can't do this with a regex, since the regex is applied against the tokens. The text has already been split into smaller tokens, so /foo bar/ doesn't have a foo bar token to match against, just foo and bar, and neither of those matches the regex.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 MatsLindh