How to initiate and handle the Bucket Pattern?
I am trying to use the Bucket Pattern for my DB collection, following the concept in this MongoDB post. I wonder what the smart and usual way is to handle the first document when a collection is empty.
For example, suppose we create a collection named UserBookMarks in which each bucket stores up to 50 bookmarks. The first time a user adds a bookmark, there is no document whose array I can push to. And when the user adds a 51st bookmark, how can I know that the previous bucket is full so that I create a new one?
Here is the approach I am considering. When a user adds a bookmark, I query for that user's bucket with count < 50 and push to it. If no such bucket exists, I create one and push to it. But that costs an extra query to find the bucket. Does MongoDB have any built-in support for this kind of design pattern?
Solution 1:[1]
First of all, when learning patterns it is imperative to learn them in the context of the use cases they are appropriate for.
From the page referred to in the question:
The Bucket Pattern
With data coming in as a stream
I have a hard time imagining UserBookMarks arriving as a stream, but let's set that aside and assume there is such a stream.
More importantly, though, the pattern is intended as an optimisation: the document contains the bucket boundaries. In the example from the same page:
start_date: ISODate("2019-01-31T10:00:00.000Z"),
end_date: ISODate("2019-01-31T10:59:59.000Z"),
or "one hour" if you translate it to English. The point is that the measurements bucket is not bounded by the number of documents, but by the timestamps of the documents within it. The idea is that you index only the bucket boundaries rather than the timestamp of every measurement, which keeps the index shorter and saves RAM. It is not uncommon to combine this with pre-aggregation, calculating some stats on the documents in the bucket at write time.
Setting the boundaries of the bucket by the number of documents it holds defeats this purpose.
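The pre-aggregation mentioned above could be sketched roughly as follows. The helper name `buildPreAggregatedUpdate` and the field names `transaction_count` and `sum_temperature` are assumptions made for illustration. The idea is only that `$inc` keeps running totals on the bucket in the same write that `$push` adds the measurement.

```javascript
// Hypothetical helper (name and stat fields are illustrative assumptions):
// builds an update document that appends one measurement to a bucket and
// maintains running stats for that bucket at write time.
function buildPreAggregatedUpdate(measurement) {
  return {
    // Append the raw measurement to the bucket's array.
    $push: { measurements: measurement },
    // Keep a running count and sum so an average can be read
    // without scanning the whole array.
    $inc: {
      transaction_count: 1,
      sum_temperature: measurement.temperature,
    },
  };
}

const exampleUpdate = buildPreAggregatedUpdate({
  timestamp: new Date("2019-01-31T10:42:00.000Z"),
  temperature: 42,
});
console.log(JSON.stringify(exampleUpdate, null, 2));
```

Such an update document would be passed as the second argument of `updateOne`, with the bucket boundaries as the filter. The average temperature for a bucket is then `sum_temperature / transaction_count`.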
The answer
Assuming the use case from the article, the "smart and usual way" is to use upserts:
    db.collection.updateOne(
      // Filter: the bucket boundaries for this measurement
      {
        start_date: ISODate("2019-01-31T10:00:00.000Z"),
        end_date: ISODate("2019-01-31T10:59:59.000Z")
      },
      {
        // $push creates the measurements array when the upsert inserts a new bucket
        $push: {
          measurements: {
            timestamp: ISODate("2019-01-31T10:42:00.000Z"),
            temperature: 42
          }
        }
      },
      { upsert: true }
    )
The boundaries start_date and end_date are calculated at the application level from the measurement date ISODate("2019-01-31T10:42:00.000Z").
If a document with these boundaries already exists in the collection, the command will $push the new measurement to the bucket. Otherwise the upsert inserts a new document with these boundaries, and $push creates the measurements array with the new measurement as its first element. A separate $setOnInsert on measurements is not needed; MongoDB rejects an update that applies both $push and $setOnInsert to the same field as a conflict.
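As a minimal sketch of the application-level boundary calculation, assuming one-hour buckets in UTC as in the example above, the boundaries could be derived like this. The helper name `hourBucketBounds` is hypothetical and not part of MongoDB.

```javascript
// Hypothetical helper: derives the one-hour bucket boundaries (UTC)
// that contain the given measurement timestamp.
function hourBucketBounds(timestamp) {
  // Copy the date so the caller's object is not mutated.
  const start = new Date(timestamp.getTime());
  // Truncate to the top of the hour: minutes, seconds, milliseconds = 0.
  start.setUTCMinutes(0, 0, 0);
  // Last whole second of the same hour (HH:59:59.000), as in the article's example.
  const end = new Date(start.getTime() + 60 * 60 * 1000 - 1000);
  return { start_date: start, end_date: end };
}

const bounds = hourBucketBounds(new Date("2019-01-31T10:42:00.000Z"));
console.log(bounds.start_date.toISOString()); // 2019-01-31T10:00:00.000Z
console.log(bounds.end_date.toISOString());   // 2019-01-31T10:59:59.000Z
```

The returned object has the same shape as the filter of the upsert, so it could be passed as the first argument of `updateOne`.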
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Alex Blex |
