'Special trie for autocomplete sentences with repeated words?
Assume that there is a finite alphabet of letters like " ", "a", "b".
Assume there is next a finite set of finite words (without space) can be formed like "abba", "baba", "aba".
Then assume there is a finite set of finite sentences in that language, like "aba abba baba", "abba abba aba", with a maximum number of words per sentence.
How can this set of sentences be space-efficiently stored e.g. for ranked autocomplete (finding 10 highest scored sentences starting with a prefix).
A simple forward trie as an example could be build by adding all sentences, but it seems such a trie would waste space by repeating comparisons for each word wherever it occurs.
Wikipedia mentions radix trees, hashmaps and ternary trees as alternatives to tries: https://en.wikipedia.org/wiki/Ternary_search_tree, https://en.wikipedia.org/wiki/Radix_tree
But I imagine it should be possible to have a trie just for all single words, and extend that in such a way that after matching any word, there is a link back to the root of the note that depends on what words have already been traversed. This could also be a separate trie using words to store sentences the same way the initial try uses letters to store words.
Is there such a datastructure?
(related questions: Implemet trie in Java in a memory efficient manner, Trie saves space, but how?)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|
