Function fails to fully build JSON under load

I'm working on an OSS project called pSConfig Web Admin (PWA). The biggest problem we're having is that under load, the generated JSON will occasionally be delivered incomplete (as documented in this issue).

When it fails, it seems to always fail in the same way. Namely, the groups object only includes one item when it should contain more. For example, instead of getting this:

{
  "archives": { … },
  "addresses": { … },
  "groups": {
    "Group 1": { … },
    "Group 2": { … },
    "Group 3": { … },
    "Group 4": { … }
  },
  "tests": { … },
  "schedules": { … },
  "tasks": { … },
  "_meta": { … },
  "hosts": { … }
}

We'll get this:

{
  "archives": { … },
  "addresses": { … },
  "groups": {
    "Group 4": { … }
  },
  "tests": { … },
  "schedules": { … },
  "tasks": { … },
  "_meta": { … },
  "hosts": { … }
}

I think the problem is that some async calls are returning before they should, and I fear Zalgo has been unleashed, since it doesn't happen every time. I suspect the problem is in the exports._process_published_config function, specifically in how it uses async.parallel inside async.eachSeries (but this is just speculation on my part):

exports._process_published_config = function (_config, opts, cb) {
    …

    async.eachSeries(
        _config.tests,
        function (test, next_test) {
            var type = test.mesh_type;

            if (!test.enabled) return next_test();
            async.parallel(
                [
                    function (next) {
                        //a group
                        if (!test.agroup) return next();
                        generate_group_members(
                            test,
                            test.agroup,
                            test_service_types,
                            type,
                            next,
                            "a-"
                        );
                    },
                    function (next) {
                        //b group
                        if (!test.bgroup) return next();
                        generate_group_members(
                            test,
                            test.bgroup,
                            test_service_types,
                            type,
                            next,
                            "b-"
                        );
                    },
                    function (next) {
                        if (!test.nahosts) return next();
                        resolve_hosts(test.nahosts, function (err, hosts) {
                            if (err) return next(err);
                            test.nahosts = hosts;
                            hosts.forEach(function (host) {
                                host_catalog[host._id] = host;
                            });
                            next();
                        });
                    },
                    function (next) {
                        //testspec
                        if (!test.testspec) return next();

                        resolve_testspec(test.testspec, function (err, row) {
                            if (err) return next(err);
                            test.testspec = row;

                            //suppress testspecs that don't meet min host version
                            if (!_config._host_version) return next();
                            var hostv = parseInt(_config._host_version[0], 10);
                            var minver =
                                config.meshconfig.minver[test.service_type];
                            for (var k in test.testspec.specs) {
                                //if minver is set for this testspec, make sure host version meets it
                                if (minver && k in minver) {
                                    if (hostv < minver[k])
                                        delete test.testspec.specs[k];
                                }
                            }
                            next();
                        });
                    },
                ],
                next_test
            );
        },
        function (err) {
            …
        }
    );
};

Has anyone seen something like this before, or does anything in this code stand out as completely wrong?

I'm trying to get out of callback hell by rewriting portions of this using promises and async/await, but I'm not entirely sure that'll solve this problem. In any case, it's very difficult to rewrite because the callbacks are so deeply nested.



Solution 1:[1]

Very likely, generate_group_members first checks whether the 'groups' array is empty, then makes some async call, and only then initializes 'groups' as [] and pushes something into it. Two parallel invocations of generate_group_members can therefore both find 'groups' empty, and each one then initializes it as [], so the later initialization throws away whatever the earlier call had already pushed.
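To make that check-then-act race concrete, here is a minimal sketch. The body of generate_group_members isn't shown in the question, so this is only a guess at the pattern, not the project's actual code. The names fakeLookup, addGroupRacy, addGroupSafe and demo are all hypothetical, and 'groups' is modeled as an object keyed by group name to match the JSON in the question. Plain promises and setTimeout stand in for the real async calls.

```javascript
// Hypothetical stand-in for an asynchronous lookup (e.g. a database query).
function fakeLookup(value, delayMs) {
  return new Promise((resolve) => setTimeout(() => resolve(value), delayMs));
}

// Racy pattern (assumed, not taken from PWA): check, wait, then initialize.
// Both parallel calls run the check before either one has assigned
// state.groups, so both decide that they must initialize it.
async function addGroupRacy(state, name, delayMs) {
  const needsInit = !state.groups; // check
  const members = await fakeLookup([name + "-host"], delayMs); // other tasks run here
  if (needsInit) state.groups = {}; // acts on a stale check, may discard earlier writes
  state.groups[name] = members;
}

// Safer pattern: initialize synchronously, before the first await, so there
// is no gap between the check and the assignment.
async function addGroupSafe(state, name, delayMs) {
  if (!state.groups) state.groups = {};
  const members = await fakeLookup([name + "-host"], delayMs);
  state.groups[name] = members;
}

// Runs two parallel calls against each version and reports which group
// names survived in each case.
async function demo() {
  const racy = {};
  await Promise.all([
    addGroupRacy(racy, "Group 1", 5),
    addGroupRacy(racy, "Group 2", 20),
  ]);

  const safe = {};
  await Promise.all([
    addGroupSafe(safe, "Group 1", 5),
    addGroupSafe(safe, "Group 2", 20),
  ]);

  return { racy: Object.keys(racy.groups), safe: Object.keys(safe.groups) };
}

demo().then((result) => console.log(result));
```

In the racy version only the group written by the slower call survives, which looks a lot like the "only Group 4 is left" symptom. If the real function has this shape, moving the initialization ahead of any async work (or doing it once, before async.parallel starts) should close the gap.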

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Andrey