Elasticsearch - Can I use the join data type for many to one relationship cases (sql like)?
In Elasticsearch, I need to maintain a one-to-many relationship and have been thinking about the join data type. I know that denormalized documents are preferred and the way to go, but in my case the synchronization logic is not straightforward. My case is as follows:
- Parent documents are updated on a daily basis.
- There are far fewer parent documents than child documents. One parent document can have up to 100,000 child documents.
- Child documents keep being added on a daily basis.
My index has about 20 million documents and is about 120 GB in size.
Is it advisable to use the join data type?
Solution 1:[1]
It is not recommended, as mentioned in the documentation:
We don’t recommend using multiple levels of relations to replicate a relational model. Each level of relation adds an overhead at query time in terms of memory and computation. For better search performance, denormalize your data instead.
Also, parent and child documents must be on the same shard, so you need to use routing when indexing documents in Elasticsearch.
It is required to index the lineage of a parent in the same shard so you must always route child documents using their greater parent id.
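To make the routing requirement concrete, here is a minimal sketch in Python that only builds the REST requests (method, path, JSON body) without sending them. The index name `my-index`, the field name `my_join_field`, the relation names `parent` and `child`, and the helper functions are all hypothetical and chosen for illustration; the point is that the child request carries `routing=<parent id>`.

```python
import json
from urllib.parse import urlencode

INDEX = "my-index"            # hypothetical index name
JOIN_FIELD = "my_join_field"  # hypothetical join field name

# Mapping with a single parent -> child relation (one level only).
mapping = {
    "mappings": {
        "properties": {
            JOIN_FIELD: {"type": "join", "relations": {"parent": "child"}}
        }
    }
}


def parent_request(parent_id, source):
    """Build (method, path, body) for indexing a parent document."""
    body = {**source, JOIN_FIELD: {"name": "parent"}}
    return "PUT", f"/{INDEX}/_doc/{parent_id}", body


def child_request(child_id, parent_id, source):
    """Build (method, path, body) for a child document.

    The routing value is the parent id, so the child is stored on the
    same shard as its parent.
    """
    body = {**source, JOIN_FIELD: {"name": "child", "parent": parent_id}}
    query = urlencode({"routing": parent_id})
    return "PUT", f"/{INDEX}/_doc/{child_id}?{query}", body


requests_to_send = [
    ("PUT", f"/{INDEX}", mapping),
    parent_request("p1", {"title": "example parent"}),
    child_request("c1", "p1", {"text": "example child"}),
]
for method, path, body in requests_to_send:
    print(method, path, json.dumps(body))
```

Because the parent is referenced only by id, a daily update of the parent document does not require rewriting its children, which is the main appeal of the join field in a case like yours.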
There are other limitations as well, which are listed here.
I suggest doing a proof of concept (POC) with your data and checking the performance. If it meets your expectations, you can go ahead with the join data type.
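For the POC, the queries worth timing are the parent/child ones, since that is where the join overhead shows up. The sketch below builds `has_child` and `has_parent` search bodies in Python; the relation names `parent` and `child`, the index name, and the `text` and `title` fields are assumptions for illustration. The `took` value in each search response (milliseconds) gives a rough number to compare against a denormalized index.

```python
import json


def has_child_query(child_type, child_query, size=10):
    """Search body that returns parents having at least one matching child."""
    return {
        "size": size,
        "query": {"has_child": {"type": child_type, "query": child_query}},
    }


def has_parent_query(parent_type, parent_query, size=10):
    """Search body that returns children whose parent matches the query."""
    return {
        "size": size,
        "query": {
            "has_parent": {"parent_type": parent_type, "query": parent_query}
        },
    }


# Hypothetical field names; replace with fields from your own documents.
parents_with_matching_children = has_child_query(
    "child", {"match": {"text": "error"}}
)
children_of_matching_parents = has_parent_query(
    "parent", {"term": {"title": "example"}}
)

for body in (parents_with_matching_children, children_of_matching_parents):
    print("POST /my-index/_search")
    print(json.dumps(body, indent=2))
```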
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Sagar Patel |
