`yarn bin` not working on GitHub Actions?

`yarn bin` can usually be used to return the path to a binary installed locally.
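For example, locally it prints something like this (the path here is illustrative):

$ yarn bin jest
/path/to/project/node_modules/.bin/jest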

It seems like it doesn't work once it runs on a GitHub Action. Is there anything else in this environment that throws off Yarn's path detection?

The command I'm running is `yarn node --experimental-vm-modules $(yarn bin jest)` (it works locally, but fails on GitHub). That's the command recommended by Jest for running tests on native ESM modules.

I have an example failed run you can see here:

$ yarn node --experimental-vm-modules $(yarn bin jest)
node:internal/modules/cjs/loader:936
  throw err;
  ^

Error: Cannot find module '/home/runner/work/Inquirer.js/Inquirer.js//home/runner/work/Inquirer.js/Inquirer.js/node_modules/.bin/jest'
    at Function.Module._resolveFilename (node:internal/modules/cjs/loader:933:15)
    at Function.Module._load (node:internal/modules/cjs/loader:778:27)
    at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:77:12)
    at node:internal/main/run_main_module:17:47 {
  code: 'MODULE_NOT_FOUND',
  requireStack: []
}
error Command failed.

The weird thing is that the path seems to include the project directory twice: /home/runner/work/Inquirer.js/Inquirer.js//home/runner/work/Inquirer.js/Inquirer.js/node_modules/.bin/jest. That's likely related to why it fails, but I can't figure out how it ends up like this 🤔
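One way to sidestep the command substitution entirely (a sketch, not a confirmed fix for this particular run) is to pass the flag through NODE_OPTIONS and let Yarn resolve the jest binary itself, or to hand node a relative path:

$ NODE_OPTIONS=--experimental-vm-modules yarn jest
$ yarn node --experimental-vm-modules ./node_modules/.bin/jest

Both assume jest is installed as a direct devDependency and that node_modules/.bin exists (i.e. the node-modules linker).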



Solution 1:[1]

There's no specific number.

For a rough estimate: from the Connect API side, tasks.max is the only configurable setting mentioned above that really matters. Each task starts its own set of consumer/producer instances, which only communicate with the leader partition.
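As a sketch, tasks.max lives in the connector configuration you submit to the Connect REST API (the connector name and class below are placeholders):

{
  "name": "example-source",
  "config": {
    "connector.class": "org.example.ExampleSourceConnector",
    "tasks.max": "4"
  }
}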

Internally, the framework produces and consumes data on the Connect status, offsets, and config topics. By default, a few of those have up to 50 partitions, meaning one connection for each.

After data reaches the leader partition, it's replicated within the cluster according to your replication factor (still over TCP).
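The names, partition counts, and replication factors of those internal topics come from the distributed worker configuration, roughly like this (a sketch; check your worker's actual values):

# connect-distributed.properties (sketch)
config.storage.topic=connect-configs
offset.storage.topic=connect-offsets
offset.storage.partitions=25
status.storage.topic=connect-status
status.storage.partitions=5
config.storage.replication.factor=3
offset.storage.replication.factor=3
status.storage.replication.factor=3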

Some source connectors may additionally create an AdminClient connection in order to create topics ahead of writing the data.
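On newer Connect versions, that pre-creation can be steered from the source connector's config, assuming topic.creation.enable is left at true on the worker; a sketch, added alongside the rest of the connector config:

"topic.creation.default.partitions": "3",
"topic.creation.default.replication.factor": "3"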

Other connectors may use additional topics, such as the errors.tolerance dead-letter queue, or more specific ones like confluent.license.topic, Debezium's database history topic, or MirrorMaker2's heartbeat topic.
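The dead-letter queue, for instance, is driven by sink connector settings along these lines (the topic name here is just an example; a sketch):

"errors.tolerance": "all",
"errors.deadletterqueue.topic.name": "my-connector-dlq",
"errors.deadletterqueue.topic.replication.factor": "1"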

If you're using Confluent Schema Registry, then that also maintains a _schemas topic.
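That topic name is configurable in the Schema Registry properties; the default, as far as I recall, is:

kafkastore.topic=_schemas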

Finally, sink connectors' consumers will be writing their committed offsets to the __consumer_offsets topic.


For some of these, increasing internal client configs, such as the consumer's max.poll.records or the producer's batch.size, will reduce the frequency of requests made over those connections, at the expense of potentially dropping or duplicating data during errors and rebalances.
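Worker-wide, those are set with the consumer. and producer. prefixes in the worker properties (values below are illustrative, not recommendations):

# connect-distributed.properties (sketch)
consumer.max.poll.records=2000
producer.batch.size=65536

Per-connector overrides (consumer.override.*, producer.override.*) are also possible if the worker's connector.client.config.override.policy allows them.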

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1: OneCricketeer