AzureML data sharing between pipelines
Taking my first steps in AML...
I am trying to create several pipelines; the idea is that some of the data generated by one pipeline will eventually be used by other pipelines. The way I am doing this is as follows:
- In the first pipeline, I am registering the data that I want to use later on as datasets by
dir = OutputFileDatasetConfig(<<name>>).read_delimited_files().register_on_complete(<<ds_name>>)
- I am saving the data normally (the data is NumPy arrays)
np.savetxt(os.path.join(<<dir>>, <<file>>), X_test, delimiter=",")
- In the second pipeline, I am reading the location of the data
dir = Run.get_context().input_datasets[<<ds_name>>].download()
and then loading it into NumPy (note that np.loadtxt splits on whitespace by default, so the delimiter must match what was used in np.savetxt)
a = np.loadtxt(dir[0], delimiter=",")
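The save/load round trip in the steps above can be sketched locally like this; it is a minimal illustration assuming the data is a 2-D NumPy array, with a temporary directory standing in for the OutputFileDatasetConfig output path and the downloaded dataset path (the variable and file names are placeholders, not from the original post):

```python
import os
import tempfile

import numpy as np

# Placeholder for the array produced in the first pipeline.
X_test = np.arange(6, dtype=float).reshape(2, 3)

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "X_test.csv")
    # First pipeline: write the array as comma-delimited text
    # into the OutputFileDatasetConfig directory.
    np.savetxt(path, X_test, delimiter=",")
    # Second pipeline: read it back; loadtxt must be given the
    # same delimiter, since its default is whitespace.
    a = np.loadtxt(path, delimiter=",")

# The round trip should reproduce the original array exactly here,
# since these small float values survive the text format unchanged.
assert np.array_equal(a, X_test)
```

Running the round trip locally first is a cheap way to catch delimiter mismatches before they surface as parse errors inside a pipeline run.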
Not sure if there are better ways to achieve this. Any ideas, please?
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
