'Add empty buckets in mongo aggregation

I have the following aggregation:

const buckets = await StatisticModel.aggregate([
  {
    $bucket: {
      groupBy: '$ranking',
      boundaries: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11],
    },
  },
])

Which returns the following object:

[
  { _id: 3, count: 6  },
  { _id: 4, count: 98 },
  { _id: 5, count: 81 },
  { _id: 6, count: 25 },
  { _id: 7, count: 4 }
]

How can I add the missing (empty) buckets?

This is a simple example but I have more complexe ones where I generate the boundaries and I want to return to the front-end all the buckets and not only the filled ones.



Solution 1:[1]

You can use below aggregation

db.collection.aggregate([
  { "$facet": {
    "data": [
      { "$bucket": {
        "groupBy": "$ranking",
        "boundaries": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11]
      }}
    ]
  }},
  { "$addFields": {
    "data": {
      "$map": {
        "input": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11],
        "as": "i",
        "in": {
          "_id": "$$i",
          "count": {
            "$cond": [
              { "$eq": [{ "$indexOfArray": ["$data._id", "$$i"] }, -1] },
              0,
              { "$arrayElemAt": ["$data.count", { "$indexOfArray": ["$data._id", "$$i"] }] }
            ]
          }
        }
      }
    }
  }},
  { "$unwind": "$data" },
  { "$replaceRoot": { "newRoot": "$data" }}
])

But better to do with javascript

const array = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11]
const array2 = [
  { "_id": 3, "count": 6 },
  { "_id": 4, "count": 98 },
  { "_id": 5, "count": 81 },
  { "_id": 6, "count": 25 },
  { "_id": 7, "count": 4 }
]

array.map((ar) => {
  const index = array2.map((e) => { return e._id }).indexOf(ar)
  if (index === -1) {
    array2.push({ _id: ar, count: 0 })
  }
})

console.log(array2)

Solution 2:[2]

Starting in Mongo 5.1, it's a perfect use case for the new $densify aggregation operator:

// { ranking: 3, count: 6 }
// { ranking: 4, count: 98 }
// { ranking: 6, count: 25 }
// { ranking: 7, count: 4 }
db.collection.aggregate([
  { $densify: {
    field: "ranking",
    range: { step: 1, bounds: [0, 12] }
  }},
  { $set: { count: { $cond: [ { $not: ["$count"] }, 0, "$count" ] } } }
])
// { ranking: 0,  count: 0 } <=
// { ranking: 1,  count: 0 } <=
// { ranking: 2,  count: 0 } <=
// { ranking: 3,  count: 6 }
// { ranking: 4,  count: 98 }
// { ranking: 5,  count: 0 } <=
// { ranking: 6,  count: 25 }
// { ranking: 7,  count: 4 }
// { ranking: 8,  count: 0 } <=
// { ranking: 9,  count: 0 } <=
// { ranking: 10, count: 0 } <=
// { ranking: 11, count: 0 } <=

This:

  • densifies documents ($densify) by creating new documents in a sequence of documents where certain values for a field (in our case field: "ranking") are missing:
    • the step for our densification is 1: range: { step: 1, ... } since our buckets are following each other with a size of 1.
    • and we densify within the range [0, 12]: bounds: [0, 12]
  • finally sets ($set) count to 0 only for new documents included during the densify stage ({ count: { $cond: [ { $not: ["$count"] }, 0, "$count" ] } })

Note that I'm assuming your buckets are of equal size (assuming the missing 10 in your list is an oversight).

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Ashh
Solution 2