How to create OneToMany references between documents already inserted in MongoDB
This is my first time using MongoDB, and I don't know how to create references (the equivalent of relationships in SQL) between documents that are already in my database.
I have two collections: the first one is called Films, and its documents have information about films (title, a unique URL, description...). Here is an example of a document:
{
  "_id": {
    "$oid": "6272aa886441c51b18de7b23"
  },
  "type": "Film",
  "title": "Les 2 papas et la maman",
  "genres": "Comedia",
  "description": "Jérôme and Delphine want a child but Jerome is sterile. They then ask the best friend of Jerome, Salim, to be the donor for artificial insemination of the mother...",
  "platform": "Netflix",
  "film_url": "exampleurl.com"
}
There is also a second collection called "Actors". Each document in it holds information about one actor: the film in which he/she appears (title and unique URL) and the character he/she plays. Here is an example document:
{
  "_id": {
    "$oid": "6272ac5b6441c51b18de9ee4"
  },
  "name": "Sophie Rundle",
  "film_title": "Peaky Blinders",
  "film_url": "exampleurl.com",
  "character": "Ada Shelby",
  "num_episodes": "36"
}
I want to create a OneToMany reference between the Films collection and the Actors collection (a film has many actors, and each actor document describes one character in one film) by adding an array to each Film document that contains the ids of the actors who appear in that film. I have the unique field "film_url" in both collections, and I also have two CSV files with the data, so I could read and iterate over them to build the references. That doesn't seem like a good idea in terms of efficiency, though, since each file has more than 10,000 lines.
Is there a simpler and more efficient way to create these references in MongoDB?
Solution 1:[1]
Here's one way you could add and populate an "actorIds" array in each document of the Films collection. If your collections are large, I expect this will take some time, since each film needs to look up its actors. An index on "film_url" in the Actors collection (the field the "$lookup" matches on) will probably help.
Before running the pipeline, I would read MongoDB's warning about "$merge" writing "output to the same collection that is being aggregated". Infinite loops are not fun, but I doubt that would be an issue with this pipeline. You should make your own decision about using this - it's your responsibility (not mine!). Also note that combining "localField"/"foreignField" with "pipeline" inside "$lookup" requires MongoDB 5.0 or later. Expert analysis/opinions are welcome.
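As a rough sketch of the index idea, the helper below creates a single-field index on "film_url" in Actors. The function name `ensureFilmUrlIndex` and its `database` parameter are my own invention for illustration; in mongosh you would pass the built-in `db` object. The index is deliberately not unique, since several actors can point at the same film.

```javascript
// Hypothetical helper (not part of MongoDB): pass it the mongosh `db` object.
function ensureFilmUrlIndex(database) {
  // $lookup matches Actors.film_url against each film's film_url,
  // so the foreign field is the one worth indexing.
  return database.getCollection("Actors").createIndex({ film_url: 1 });
}

// In mongosh, against the database holding Films and Actors:
//   ensureFilmUrlIndex(db)
```

The "$merge" stage matches on "_id" by default, which is always indexed, so no extra index should be needed on the Films side. With that in place, the pipeline itself is: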
db.Films.aggregate([
  {
    "$lookup": {
      "from": "Actors",
      "localField": "film_url",
      "foreignField": "film_url",
      "pipeline": [
        {
          "$project": {
            "_id": 1
          }
        }
      ],
      "as": "actorIds"
    }
  },
  {
    "$set": {
      "actorIds": {
        "$map": {
          "input": "$actorIds",
          "in": "$$this._id"
        }
      }
    }
  },
  {
    "$merge": "Films"
  }
])
Try it on mongoplayground.net.
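For intuition about what the pipeline above computes, here is a minimal in-memory sketch in plain JavaScript. It is not how MongoDB executes the aggregation; the sample documents, the short string ids, the URLs, and the function name `attachActorIds` are all made up for illustration.

```javascript
// Hypothetical sample data shaped like the documents in the question.
const films = [
  { _id: "f1", title: "Peaky Blinders", film_url: "example.com/peaky-blinders" },
  { _id: "f2", title: "Les 2 papas et la maman", film_url: "example.com/les-2-papas" },
];
const actors = [
  { _id: "a1", name: "Sophie Rundle", film_url: "example.com/peaky-blinders" },
  { _id: "a2", name: "Hypothetical Actor", film_url: "example.com/peaky-blinders" },
];

function attachActorIds(films, actors) {
  // Group actor ids by film_url in one pass (the role the $lookup plays).
  const idsByUrl = new Map();
  for (const actor of actors) {
    if (!idsByUrl.has(actor.film_url)) idsByUrl.set(actor.film_url, []);
    idsByUrl.get(actor.film_url).push(actor._id);
  }
  // Like $set + $map: each film gets a flat array of ids,
  // and a film with no matching actors gets an empty array.
  return films.map((film) => ({
    ...film,
    actorIds: idsByUrl.get(film.film_url) ?? [],
  }));
}

const result = attachActorIds(films, actors);
console.log(result.map((film) => [film.title, film.actorIds]));
```

Note that a film with no matching actors still ends up with an "actorIds" field holding an empty array, which mirrors what "$lookup" returns when nothing matches.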
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
