Transfer large datasets to Azure Blob Storage from Amazon S3
I want to transfer large datasets from Amazon S3 to Azure Blob Storage. How can I modify my code to handle large datasets? Below is my Java code:
```java
try {
    storageAccount = new CloudStorageAccount(
            new StorageCredentialsAccountAndKey(
                    azureCredentialsDto.getStorageAccountName(),
                    azureCredentialsDto.getStorageAccountKey()),
            true);
    blobClient = storageAccount.createCloudBlobClient();
    container = blobClient.getContainerReference(azureCredentialsDto.getBlobContainerName());
    log.info("Creating container: " + container.getName());
    container.createIfNotExists(BlobContainerPublicAccessType.CONTAINER,
            new BlobRequestOptions(), new OperationContext());

    CloudBlockBlob blob = container.getBlockBlobReference(destinationFileName);
    URI blockBlobUrl = blob.getUri();
    log.info("Blob URI: " + blockBlobUrl);

    // sourceFileUrl is the URL of the Amazon S3 file I want to copy.
    // Note: startCopy is asynchronous; it returns before the server-side copy finishes.
    blob.startCopy(new URI(sourceFileUrl));
    log.info("Copy started...");
} catch (Exception e) {
    log.error("Copy failed", e);
}
```
Solution 1
If you just need to transfer large files, the best option is to use the Copy activity in Azure Data Factory (ADF).
AzCopy is a command-line utility that you can use to copy blobs or files to or from a storage account. AzCopy v10 (Preview) supports Amazon Web Services (AWS) S3 as a data source, so you can copy an entire AWS S3 bucket, or even multiple buckets, to Azure Blob Storage.
AzCopy and ADF are the two best approaches for moving large files.
To use AzCopy, refer to Move your data from AWS S3 to Azure Storage using AzCopy.
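As a rough sketch, an AzCopy v10 invocation for this scenario looks like the following. The bucket name, storage account, container name, and SAS token are placeholders you must supply; AzCopy reads the AWS credentials from the two environment variables shown.

```shell
# AzCopy v10 reads AWS credentials from these environment variables.
export AWS_ACCESS_KEY_ID='<aws-access-key>'
export AWS_SECRET_ACCESS_KEY='<aws-secret-key>'

# Copy an entire S3 bucket into an Azure Blob container.
# The destination URL carries a SAS token for authorization.
azcopy copy \
  'https://s3.amazonaws.com/<bucket-name>' \
  'https://<storage-account>.blob.core.windows.net/<container-name>?<SAS-token>' \
  --recursive
```

Because the copy runs server-side between the two clouds, this scales to buckets far larger than the machine running AzCopy could stage locally.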
To accomplish this using ADF, refer to the links below:
https://www.youtube.com/watch?v=9uXDt0DP9cs&ab_channel=TechBrothersIT
Azure Data Factory V2 Pipelines for Copying Large AWS S3 Buckets to Azure Storage
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | UtkarshPal-MT |
