Time difference reading files from Blob-Storage-Container

We use Blobfuse to mount our blob storage container on an Azure virtual machine as well as in Azure ML Studio.
The container holds around 400 files of about 1.5 MB each.

With the Azure VM, the algorithm needs 45 seconds to read all files.
With Azure ML Studio, the same algorithm needs 5 minutes to read all files.

The Azure VM resource as well as the Azure ML Studio resource are in the same tenant.
They run on two different compute instances with the same specifications.

Why does it take so much longer to read all the files when using Azure ML Studio compared to Azure VM?
Is it possible to reduce the time needed for reading all files when using Azure ML Studio without changing the storage file hierarchy in any way?



Solution 1:[1]

Reading the files from Azure ML Studio shouldn't take longer than reading them from the VM. A few things to check:

  1. Verify how authentication to the storage account is configured; a misconfiguration there can cause a performance drop.

  2. Check the end-to-end (E2E) latency of the storage account, for both reads and writes.

  3. Verify the connection to your data in Azure storage services through Azure Machine Learning datastores.

  4. Azure Machine Learning requires additional configuration steps to communicate with a storage account that is behind a firewall or within a virtual network.

  5. If the storage account is behind a firewall, you can add your client's IP address to an allow list via the Azure portal.
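
To narrow down where the time goes (point 2 above), it may help to time the reads file by file on both machines. Below is a minimal sketch using only the Python standard library. The function names `time_reads` and `summarize` are hypothetical helpers, and the mount path passed on the command line is whatever directory Blobfuse is mounted to in your setup; nothing here is specific to Azure.

```python
import os
import statistics
import sys
import time


def time_reads(directory):
    """Read every regular file in `directory`.

    Returns (list of per-file read durations in seconds, total bytes read).
    """
    durations = []
    total_bytes = 0
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories and other non-file entries
        start = time.perf_counter()
        with open(path, "rb") as handle:
            data = handle.read()
        durations.append(time.perf_counter() - start)
        total_bytes += len(data)
    return durations, total_bytes


def summarize(durations, total_bytes):
    """Condense the raw timings into a few comparable numbers."""
    total = sum(durations)
    return {
        "files": len(durations),
        "total_seconds": total,
        "median_seconds": statistics.median(durations) if durations else 0.0,
        "max_seconds": max(durations, default=0.0),
        # Throughput in MB/s (1 MB = 1e6 bytes); 0.0 if nothing was timed.
        "mb_per_second": (total_bytes / 1e6) / total if total > 0 else 0.0,
    }


if __name__ == "__main__":
    # Assumption: the first argument is the Blobfuse mount point.
    mount_point = sys.argv[1] if len(sys.argv) > 1 else "."
    print(summarize(*time_reads(mount_point)))
```

Run the same script on the VM and on the ML Studio compute and compare the output. A uniformly higher median suggests per-request latency (authentication or network path), while a few very slow files point to something else. Blobfuse typically keeps a local file cache, so a second run may be served from disk; comparing first-run numbers is probably the fairer test.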

Reference links:
https://docs.microsoft.com/en-us/azure/machine-learning/how-to-connect-data-ui?tabs=credential
https://docs.microsoft.com/en-us/azure/machine-learning/concept-optimize-data-processing

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 SureshBabu-MT