Suggestions on full-text search or existing search algorithms
Can someone suggest an easy way to solve the search problem below? Is there an existing algorithm for it, or will full-text search suffice?
The items are classified as follows:
| ItemCategory | ItemCluster | ItemSubCluster | SubCluster | Items |
|---|---|---|---|---|
| Vegetable | Root vegetables | Root | WithOutSkin | potato, sweet potato, yam |
| Vegetable | Root vegetables | Root | WithSkin | onion, garlic, shallot |
| Vegetable | Greens | Leafy green | Leaf | lettuce, spinach, silverbeet |
| Vegetable | Greens | Cruciferous | Flower | cabbage, cauliflower, Brussels sprouts, broccoli |
| Vegetable | Greens | Edible plant stem | Stem | celery, asparagus |
The inputs will be something like:
sweet potato, yam
Yam, Potato
garlik, onion
lettuce, spinach, silverbeet
lettuce, silverbeet
lettuce, silverbeet, spinach
For each input, I want to find which ItemCategory, ItemCluster, ItemSubCluster and SubCluster its items belong to.
Any help will be much appreciated.
Solution 1:[1]
You are nearly following the right approach.
You don't need full text searching here.
What you can create here is a kind of inverted index as follows:
Taking potato as an example, create a map entry for potato that stores its ItemCategory, ItemCluster, ItemSubCluster and SubCluster.
For example -
"potato": {
  "ItemCategory": "Vegetable",
  "ItemCluster": "Root vegetables",
  "ItemSubCluster": "Root",
  "SubCluster": "WithOutSkin"
}
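As a minimal sketch of the map described above, the table from the question can be expanded into one entry per item and queried with the sample inputs. The names `ROWS`, `build_index` and `classify` are made up for illustration, and the `difflib` fallback for misspellings such as "garlik" is an assumption of this sketch rather than part of the answer:

```python
import difflib

# Rows copied from the classification table in the question.
ROWS = [
    ("Vegetable", "Root vegetables", "Root", "WithOutSkin", "potato, sweet potato, yam"),
    ("Vegetable", "Root vegetables", "Root", "WithSkin", "onion, garlic, shallot"),
    ("Vegetable", "Greens", "Leafy green", "Leaf", "lettuce, spinach, silverbeet"),
    ("Vegetable", "Greens", "Cruciferous", "Flower",
     "cabbage, cauliflower, Brussels sprouts, broccoli"),
    ("Vegetable", "Greens", "Edible plant stem", "Stem", "celery, asparagus"),
]
FIELDS = ("ItemCategory", "ItemCluster", "ItemSubCluster", "SubCluster")


def build_index(rows):
    """Map each lower-cased item name to its classification dict."""
    index = {}
    for *levels, items in rows:
        classification = dict(zip(FIELDS, levels))
        for item in items.split(","):
            index[item.strip().lower()] = classification
    return index


def classify(query, index, cutoff=0.8):
    """Classify every comma-separated item in a query string.

    Exact, case-insensitive lookup is tried first. The difflib fallback is
    an assumption of this sketch (to tolerate typos), not part of the answer.
    Items that still cannot be matched map to None.
    """
    result = {}
    for raw in query.split(","):
        name = raw.strip().lower()
        if name not in index:
            close = difflib.get_close_matches(name, list(index), n=1, cutoff=cutoff)
            if close:
                name = close[0]
        result[raw.strip()] = index.get(name)
    return result


index = build_index(ROWS)
for query in ("sweet potato, yam", "Yam, Potato", "garlik, onion"):
    print(query, "->", classify(query, index))
```

Each lookup is a single dictionary access, so the order of the items in the input does not matter. If typo tolerance is not wanted, the fallback branch can simply be removed.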
Storing these full strings for every vegetable would be expensive.
You can optimise the storage by using an encoding scheme:
For example -
let ItemCategory be denoted by 0,
let ItemCluster be denoted by 1,
let ItemSubCluster be denoted by 2,
let SubCluster be denoted by 3
and the values be denoted by a similar encoding scheme:
let Vegetable be denoted by 0,
let Root vegetables be denoted by 1,
let Root be denoted by 2,
let WithOutSkin be denoted by 3
Now, your mapping becomes:
"potato": {
  "0": "0",
  "1": "1",
  "2": "2",
  "3": "3"
}
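The encoding step could be sketched roughly as follows. This is an illustration only: `field_code`, `value_code` and `encode_value` are hypothetical names, and codes are assumed to be handed out in first-seen order, which happens to reproduce the numbers used above for potato:

```python
FIELDS = ["ItemCategory", "ItemCluster", "ItemSubCluster", "SubCluster"]

potato = {
    "ItemCategory": "Vegetable",
    "ItemCluster": "Root vegetables",
    "ItemSubCluster": "Root",
    "SubCluster": "WithOutSkin",
}

# Field names get fixed codes 0..3 in table order.
field_code = {name: code for code, name in enumerate(FIELDS)}

# Values get the next free code the first time they are seen.
value_code = {}


def encode_value(value):
    """Return the existing code for a value, or assign the next free one."""
    return value_code.setdefault(value, len(value_code))


encoded_potato = {field_code[f]: encode_value(v) for f, v in potato.items()}
print(encoded_potato)
```

The two dictionaries `field_code` and `value_code` have to be kept alongside the encoded data, since they are needed to turn the numbers back into names.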
To further optimise this, you can also maintain an index of vegetables. For example, potato can be denoted by 0.
So your final index becomes:
"0": {
  "0": "0",
  "1": "1",
  "2": "2",
  "3": "3"
}
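A fully numeric index along these lines, together with the decoding needed to answer a query, might look like the sketch below. It assumes three sample items from the question's table, and the helper names (`code_for`, `lookup`) are hypothetical. Because the field codes are just 0 to 3, the sketch stores each entry as a tuple whose position plays the role of the field code:

```python
FIELDS = ("ItemCategory", "ItemCluster", "ItemSubCluster", "SubCluster")

# A few items from the question's table, in (category, cluster, ...) order.
ITEMS = {
    "potato": ("Vegetable", "Root vegetables", "Root", "WithOutSkin"),
    "onion": ("Vegetable", "Root vegetables", "Root", "WithSkin"),
    "lettuce": ("Vegetable", "Greens", "Leafy green", "Leaf"),
}

item_ids = {}       # item name  -> integer id
value_ids = {}      # value name -> integer code
encoded_index = {}  # item id    -> tuple of value codes, one per field


def code_for(table, key):
    """Return the code stored for key, assigning the next free one if new."""
    return table.setdefault(key, len(table))


for item, levels in ITEMS.items():
    codes = tuple(code_for(value_ids, value) for value in levels)
    encoded_index[code_for(item_ids, item)] = codes

# Reverse table used to decode value codes back into names.
value_names = {code: value for value, code in value_ids.items()}


def lookup(item):
    """Decode the classification of one item name (case-insensitive)."""
    codes = encoded_index[item_ids[item.strip().lower()]]
    return {field: value_names[code] for field, code in zip(FIELDS, codes)}


print(encoded_index)
print(lookup("Onion"))
```

Repeated values such as "Vegetable" are stored once in `value_ids` and referenced by number everywhere else, which is where the space saving comes from.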
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
