AzureML data sharing between pipelines

Moving my first steps in AML...

I am trying to create several pipelines; the idea is that some of the data generated by one pipeline will eventually be used by other pipelines. The way I am doing this is as follows:

  • In the first pipeline, I am registering the data that I want to use later on as a dataset:
dir = OutputFileDatasetConfig(<<name>>).read_delimited_files().register_on_complete(<<ds_name>>)
  • I am saving the data normally (the data are NumPy arrays):
np.savetxt(os.path.join(<<dir>>, <<file>>), X_test, delimiter=",")
  • In the second pipeline, I am reading the location of the data:
dir = Run.get_context().input_datasets[<<ds_name>>].download()

and then loading it in numpy

a = np.loadtxt(dir[0], delimiter=",")
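The save/load round trip in the steps above can be sketched locally (NumPy only, with a temporary directory standing in for the AzureML dataset paths, and a made-up `X_test` array for illustration):

```python
import os
import tempfile

import numpy as np

# Hypothetical array standing in for the real X_test
X_test = np.array([[1.0, 2.0], [3.0, 4.0]])

with tempfile.TemporaryDirectory() as out_dir:
    path = os.path.join(out_dir, "X_test.csv")
    # Same call as in the first pipeline step
    np.savetxt(path, X_test, delimiter=",")
    # Same call as in the second pipeline step; note that
    # loadtxt defaults to whitespace, so delimiter="," is needed
    a = np.loadtxt(path, delimiter=",")

print(np.array_equal(a, X_test))  # True
```

Note that `np.loadtxt` splits on whitespace by default, so reading a comma-delimited file back requires passing `delimiter=","` explicitly.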

Not sure if there are better ways to achieve this. Any ideas, please?



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
