Is it possible to dynamically create compute clusters at runtime in Azure ML?

I'm looking to dynamically create compute clusters at runtime for an Azure ML pipeline.

A simplistic version of the pipeline looks like this:

# imports assumed by the snippet (azureml-core / azureml-pipeline SDK)
from azureml.core import Workspace, Experiment
from azureml.core.compute import AmlCompute, ComputeTarget
from azureml.pipeline.core import Pipeline
from azureml.pipeline.steps import PythonScriptStep

ws = Workspace.from_config()

# create the compute
compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_D2_V2', max_nodes=1)
cpu_cluster = ComputeTarget.create(ws, 'test-cluster', compute_config)
cpu_cluster.wait_for_completion(show_output=True)

# construct the step
step_1 = PythonScriptStep(script_name='test_script.py', name='test_step', compute_target=cpu_cluster)

# validate the pipeline and publish
pipeline = Pipeline(ws, steps=[step_1])
pipeline.validate()

# run the experiment
experiment = Experiment(workspace=ws, name=experiment_name)
pipeline_run = experiment.submit(config=pipeline)
pipeline_run.wait_for_completion()

This works perfectly fine when I run the driver script locally; however, when I publish the pipeline and execute it from ADF, the compute cluster doesn't get created and the run fails with:

UserError: Response status code does not indicate success: 400 (Unknown compute target 'test-cluster'.). Unknown compute target 'test-cluster'.

Any guidance or suggestions welcome.



Solution 1:[1]

Yes, compute targets can be created dynamically. See the documentation below for the procedure; it describes what happens when a pipeline step runs:

Document

  1. Downloads the project snapshot to the compute target from the Blob storage associated with the workspace.

  2. Builds a Docker image corresponding to each step in the pipeline.

  3. Downloads the Docker image for each step to the compute target from the container registry.

  4. Configures access to Dataset and OutputFileDatasetConfig objects. For as_mount() access mode, FUSE is used to provide virtual access. If mount isn't supported or if the user specified access as as_upload(), the data is instead copied to the compute target.

  5. Runs the step in the compute target specified in the step definition.

  6. Creates artifacts, such as logs, stdout and stderr, metrics, and output specified by the step. These artifacts are then uploaded and kept in the user's default datastore.
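Step 4 above can be made concrete. The following is a minimal sketch of the two access modes, assuming a workspace handle `ws`, a cluster named `test-cluster` (carried over from the question), and an illustrative datastore path `raw/` — the path and input/output names are assumptions, not part of the original post:

```python
from azureml.core import Workspace, Dataset
from azureml.data import OutputFileDatasetConfig
from azureml.pipeline.steps import PythonScriptStep

ws = Workspace.from_config()
datastore = ws.get_default_datastore()

# Input dataset: as_mount() provides virtual (FUSE) access where supported
input_data = (Dataset.File.from_files(path=(datastore, 'raw/'))
              .as_named_input('raw')
              .as_mount())

# Output: as_upload() copies results back to the datastore when the step completes
output_data = OutputFileDatasetConfig(name='processed').as_upload()

step = PythonScriptStep(
    script_name='test_script.py',
    name='test_step',
    arguments=['--input', input_data, '--output', output_data],
    compute_target='test-cluster',
)
```

(Running this requires an Azure ML workspace and credentials, so it is a sketch rather than a verified script.)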

The above points are also available in the document link shared.
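In practice, the "Unknown compute target" error from ADF usually means the published pipeline expects the cluster to already exist, since the driver script's creation code does not run on publish-and-trigger. One common pattern is to make the creation idempotent: look the cluster up by name and create it only if it is missing, with `min_nodes=0` so it scales to zero when idle. A minimal sketch, reusing the names from the question (`test-cluster`, `STANDARD_D2_V2`):

```python
from azureml.core import Workspace
from azureml.core.compute import AmlCompute, ComputeTarget
from azureml.core.compute_target import ComputeTargetException

ws = Workspace.from_config()

try:
    # Reuse the cluster if it already exists in the workspace
    cpu_cluster = ComputeTarget(workspace=ws, name='test-cluster')
    print('Found existing cluster; reusing it.')
except ComputeTargetException:
    # Otherwise provision it; min_nodes=0 lets it scale down to zero when idle
    compute_config = AmlCompute.provisioning_configuration(
        vm_size='STANDARD_D2_V2',
        min_nodes=0,
        max_nodes=1,
    )
    cpu_cluster = ComputeTarget.create(ws, 'test-cluster', compute_config)
    cpu_cluster.wait_for_completion(show_output=True)
```

(This sketch requires the azureml-core SDK and workspace credentials, so it cannot be run outside an Azure ML context.)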

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution sources:
Solution 1: SairamTadepalli-MT