Jobs running on the driver node in Databricks

I have a notebook that runs a machine learning job in Databricks; I'm using dbutils to accept variables passed to the notebook.

I have created another notebook as a parent; it passes the variables to the child notebook and runs multiple copies of it with a ThreadPoolExecutor (or ProcessPoolExecutor):

    from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

    def processAnIntegerNumber(id):
        dbutils.notebook.run(path="/Users/child_notebook",
                             timeout_seconds=3600,
                             arguments={"id": id})

This creates multiple jobs. The problem is that all 10-30 of these jobs (one per id I pass in) run on the driver node and never use the worker nodes, so as a result everything is extremely slow.
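For what it's worth, the thread-pool fan-out itself does work as a concurrency pattern; the threads only dispatch the notebook.run calls, and plain Python inside each child still executes on the driver. A minimal, runnable sketch of that fan-out, where run_child is a hypothetical stand-in for dbutils.notebook.run so it can run outside Databricks:

```python
from concurrent.futures import ThreadPoolExecutor

def run_child(id):
    # Hypothetical stub for dbutils.notebook.run("/Users/child_notebook", 3600, {"id": id});
    # here it just reports which id was processed.
    return f"processed id {id}"

ids = list(range(10))  # e.g. the 10-30 ids mentioned above

# Fan the calls out across threads; executor.map returns results in input order.
with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(run_child, ids))

print(results[0])  # → processed id 0
```

The threads overlap the (mostly idle) waiting on each child run, but they cannot move the child's Python work off the driver; only code that goes through Spark APIs reaches the workers.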

Is there any way to run Python notebooks in parallel on the workers, without using Scala?

Regards



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
