Is there a data structure or database feature that can handle tags of tags?

I want to categorize data so that content can be searched by tag. However, I also want a tag to have relationships with other tags, so that a search for one tag also finds content carrying related tags.

For example, I might have a database of metadata for pictures of living things, and I would like to tag each picture by species or even by breed. An image of a German Shepherd dog, say, I would tag as German Shepherd.

But if someone then searches for Dog or Canine, this picture should be included in the search results too, because I have defined the relationships that a German Shepherd is a Dog and a Dog is also a Canine (and a Canine is also a Mammal and an Animal, and so on).

As we can see, this is a complex arrangement of sets and subsets, and I don't know of any solution designed for this kind of system.



Solution 1:[1]

One solution is to represent the hierarchy as a directed semantic graph and store it in relational table form.

The table would hold the expanded form of the graph: using two columns, specific and generic, it contains a row for every pair of terms connected by a directed path in the graph, not only for direct links.

For example,

specific generic
poodle dog
poodle canine
poodle mammal
poodle animal
germanshepherd dog
germanshepherd canine
germanshepherd mammal
germanshepherd animal
dog canine
dog mammal
dog animal
canine mammal
canine animal
mammal animal

To go from a high level of the hierarchy to a low one, a single query on the generic term returns all the specific terms needed to search the tags, and the reverse lookup works the same way. You don't need recursive queries.
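
As a rough illustration of the single-query lookup, here is a minimal sketch using Python's built-in sqlite3 module. The table names tag_closure and picture_tag, the helper pictures_for, and the picture file names are all made up for this example; only the specific/generic rows come from the table above.

```python
import sqlite3

# In-memory database; table and column names are assumptions for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tag_closure (specific TEXT, generic TEXT);
    CREATE TABLE picture_tag (picture TEXT, tag TEXT);
""")

# The expanded table from the answer: one row per directed path.
conn.executemany("INSERT INTO tag_closure VALUES (?, ?)", [
    ("poodle", "dog"), ("poodle", "canine"),
    ("poodle", "mammal"), ("poodle", "animal"),
    ("germanshepherd", "dog"), ("germanshepherd", "canine"),
    ("germanshepherd", "mammal"), ("germanshepherd", "animal"),
    ("dog", "canine"), ("dog", "mammal"), ("dog", "animal"),
    ("canine", "mammal"), ("canine", "animal"),
    ("mammal", "animal"),
])

# Hypothetical pictures, each tagged with its most specific term only.
conn.executemany("INSERT INTO picture_tag VALUES (?, ?)", [
    ("rex.jpg", "germanshepherd"),
    ("fifi.jpg", "poodle"),
    ("wolf.jpg", "canine"),
])

def pictures_for(conn, term):
    """Return pictures tagged with term or with any more specific term."""
    rows = conn.execute(
        """
        SELECT DISTINCT picture
        FROM picture_tag
        WHERE tag = :term
           OR tag IN (SELECT specific FROM tag_closure WHERE generic = :term)
        ORDER BY picture
        """,
        {"term": term},
    ).fetchall()
    return [row[0] for row in rows]

print(pictures_for(conn, "dog"))
print(pictures_for(conn, "canine"))
```

Searching for "dog" finds both the poodle and the German Shepherd picture without any recursion, because every indirect link is already stored as its own row.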

Of course you can always devise a table containing just the direct relationship between two terms:

specific generic
poodle dog
germanshepherd dog
dog canine
canine mammal
mammal animal

Then use an algorithm that runs recursive queries to find all directed paths in the graph. The same algorithm can even be used to generate the expanded table automatically. This works for a simple hierarchy, but if your hierarchy has special cases, the expanded form with hand-tweaked exceptions would work better.
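
One possible way to do that expansion, if your database supports recursive common table expressions, is sketched below with SQLite through Python's sqlite3 module. The table name tag_direct and the variable names are assumptions for this example, and other databases may need slightly different syntax.

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Direct relationships only (table name is an assumption for this sketch).
db.execute("CREATE TABLE tag_direct (specific TEXT, generic TEXT)")
db.executemany("INSERT INTO tag_direct VALUES (?, ?)", [
    ("poodle", "dog"),
    ("germanshepherd", "dog"),
    ("dog", "canine"),
    ("canine", "mammal"),
    ("mammal", "animal"),
])

# Start from the direct links, then keep extending each path by one more
# direct link. UNION (not UNION ALL) drops duplicate rows, which also stops
# the recursion if the data accidentally contains a cycle.
EXPAND_SQL = """
WITH RECURSIVE closure(specific, generic) AS (
    SELECT specific, generic FROM tag_direct
    UNION
    SELECT c.specific, d.generic
    FROM closure AS c
    JOIN tag_direct AS d ON d.specific = c.generic
)
SELECT specific, generic FROM closure ORDER BY specific, generic
"""

expanded = db.execute(EXPAND_SQL).fetchall()
for specific, generic in expanded:
    print(specific, generic)
```

The result contains the same pairs as the expanded table shown earlier, so it could be written into a separate table once and refreshed whenever the direct relationships change.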

Solution 2:[2]

This kind of search is done very often in databases designed for text searching, such as Elasticsearch.

The type of index that these databases use is called an "inverted index", which is basically a highly compressed mapping from each search term to all the documents in which it appears. It is efficient for searching for many terms simultaneously.
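
To make the idea concrete, here is a toy inverted index in plain Python. It is only a sketch: real search engines compress and store these mappings very differently, and the function names, picture names, and the "indoor"/"outdoor" tags are invented for the example.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, terms in docs.items():
        for term in terms:
            index[term].add(doc_id)
    return index

def search_any(index, terms):
    """Return ids of documents containing at least one of the terms."""
    hits = set()
    for term in terms:
        hits |= index.get(term, set())
    return hits

# Hypothetical documents: picture name -> tags.
DOCS = {
    "rex.jpg": ["germanshepherd", "outdoor"],
    "fifi.jpg": ["poodle", "indoor"],
    "wolf.jpg": ["canine", "outdoor"],
}

index = build_inverted_index(DOCS)
print(sorted(search_any(index, ["poodle", "canine"])))
```

Looking up several terms at once is just a union of the per-term sets, which is why searching for many tags simultaneously stays cheap.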

Typically, when you search for text in a product like this, your search terms go through analysis steps such as "stemming", which reduces each word to a root form, and synonym expansion, which adds related terms to the query. If you search for "solar", for example, then "sun" may be added as a related term. This is very close to what you want to do.

If you have your own mapping from tags to related tags, then these kinds of search indexes/products will let you do the kind of search you want just by adding all the related tags to the query.
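
A minimal sketch of that query expansion in plain Python follows, assuming you keep your own mapping from each tag to its more specific tags. The names CHILDREN, INDEX, expand_query, and search, and the picture names, are hypothetical; a real search product would take the expanded term list through its own query API.

```python
# Assumed mapping from a tag to its directly more specific tags.
CHILDREN = {
    "animal": ["mammal"],
    "mammal": ["canine"],
    "canine": ["dog"],
    "dog": ["poodle", "germanshepherd"],
}

# Stand-in for the search index: tag -> pictures carrying that tag.
INDEX = {
    "germanshepherd": {"rex.jpg"},
    "poodle": {"fifi.jpg"},
    "canine": {"wolf.jpg"},
}

def expand_query(term, children):
    """Return the term plus every tag reachable below it in the hierarchy."""
    expanded = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in expanded:  # also guards against cycles
            expanded.add(current)
            stack.extend(children.get(current, []))
    return expanded

def search(term):
    """Search the index for the term and all of its related, narrower tags."""
    hits = set()
    for tag in expand_query(term, CHILDREN):
        hits |= INDEX.get(tag, set())
    return hits

print(sorted(expand_query("dog", CHILDREN)))
print(sorted(search("dog")))
```

Expanding at query time keeps the index small and lets the hierarchy change without re-indexing; the alternative is to add the broader tags to each document when it is indexed.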

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Matt Timmermans