'Explanation: mongoose / mongo find() vs aggregate({ $match: {}})

All, this is less of a question and more trying to understand as how the internals of mongo (3.7.3) with mongoose (5.13.14) find() vs aggregate([{$match:{}]) in a nodejs / typescript project work because a colleague of mine and I spent a few hours yesterday afternoon stuck on something and I just want to make sure no-one else makes the same mistake. This is the situation. We has 2 collections represented as such in mongo:

  1. Load which has an internal _id (of type ObjectId()) and a name;
  2. RawData which has a "reference/fk" to the Load._id as "loadId" as a STRING and a few other fields;

Because of some business requirements, we had to use an aggregation pipeline to get some of the data from the "RawData" based on the Load._id and we wrote something like this.

getRawInvalidRecordsAggregate = async (loadId: string): Promise<any[]> => {
    const records = await RawDataModel.aggregate([
      { $match: { loadId: { $eq: loadId }, validationErrors: { $ne: [] } } },
      { $lookup: { from: constants.DB_COLLECTIONS.LOOKUP_COLLECTION, localField: 'uniqueId', foreignField: 'uniqueId', as: 'lookup' } },
      { $match: { lookup: { $ne: [] } } },
      { $project: { _id: 0, validationErrors: 1, uniqueId: '$uniqueId } }
    ]).exec();
    return records;
  };

But we would not get any data back. We replicated the pipeline in compass and it worked. So we tried a "find()" version of the query just to try to understand what we were missing and wrote this using the same query (without the $lookup and $project) using the exact same "loadId" as a parameter just to see if we could get some records back.

getRawInvalidRecordsFind = async (loadId: string): Promise<any[]> => {
    const records = await RawDataModel.find({ loadId: loadId, validationErrors: { $ne: [] } }).exec();
    return records;
  };

The find() by loadId worked perfectly. Puzzled. I know we should have run the process in debug mode earlier than we did to figure out the problem...

Anyway, because we retrieved the Load._id from the freshly inserted data into the Load table right before querying, we did NOT realize that at the time that it is an ObjectId()... not a string. However, we were thinking we were passing it as a string to both queries and it does not seem to be the case. The aggregate([{ $match: { loadId: { $eq: loadId }]) does not seem to work the same way as the find({loadId: loadId}). We had to CAST the load._id as string for the aggregation to work doing something like this.

const loadId = load._id.toString();

The better solution will be to make sure the loadId in the RawData collection is the same type of the original Load._id but for now if anyone could explain how the find() converts the load._id into a string under the hood, I would love to know where / how.

Thank you.



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source