How to efficiently shard a MongoDB collection that already has millions of documents?
I have a collection named order_error with over 60 million documents. Today I tried to shard it across 3 replica sets. Initially there were no issues: the balancer was distributing the chunks among the shards. Eventually, however, it started to consume all of the RAM and then all of the swap space as well, and now everything is unresponsive. We can't follow this procedure in production, so we need a better solution. How can I shard the collection in a better way? If someone could help me with that, please let me know.
Solution 1:[1]
If you insert documents into an empty sharded collection, all data is initially written to the primary shard, so that alone will not solve your issue.
However, you can use sh.splitAt on the empty collection to pre-split it.
Note that even if the collection is empty, it takes some time until the chunks are distributed across all shards. When you split a chunk, the resulting chunks still remain on the current shard until the balancer migrates them. Check with db.collection.getShardDistribution() whether the chunks are evenly distributed.
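The pre-splitting idea above could look roughly like the following mongosh sketch. It is only an illustration under several assumptions that are not in the question: the database name mydb, the new empty collection order_error_new, a numeric shard key field orderId with values between 0 and 60,000,000, and the helper splitPoints are all hypothetical. Only sh.enableSharding, sh.shardCollection, sh.splitAt and getShardDistribution are actual shell helpers.

```javascript
// Hypothetical helper: evenly spaced split points for a numeric shard key.
// Returns (chunks - 1) boundaries strictly between min and max.
function splitPoints(min, max, chunks) {
  const points = [];
  for (let i = 1; i < chunks; i++) {
    points.push(min + Math.floor((i * (max - min)) / chunks));
  }
  return points;
}

// The part below only runs inside mongosh connected to a mongos,
// where the global `sh` helper exists.
if (typeof sh !== "undefined") {
  const ns = "mydb.order_error_new"; // assumed namespace of the empty collection

  sh.enableSharding("mydb");
  sh.shardCollection(ns, { orderId: 1 }); // assumed shard key

  // Pre-split the empty collection into 12 chunks (an arbitrary choice).
  for (const p of splitPoints(0, 60000000, 12)) {
    sh.splitAt(ns, { orderId: p });
  }

  // The new chunks start on the primary shard; run this again later
  // to see whether the balancer has spread them out.
  db.getSiblingDB("mydb").order_error_new.getShardDistribution();
}
```

Once the empty chunks are spread over all shards, documents copied into the new collection should be routed directly to the shard that owns their key range, rather than being written to one shard and migrated afterwards. Whether evenly spaced boundaries are appropriate depends on how the real shard key values are distributed.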
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Wernfried Domscheit |
