How to split files during a Copy activity in Data Factory

I am using Data Factory to copy a collection from MongoDB Atlas to ADLS Gen2. By default, Data Factory creates one JSON file per collection, which leaves me with one huge JSON file.

I looked at data flows and transformations, but they work on files that are already present in ADLS. Is there a way to split the data as it arrives in ADLS, rather than first landing a huge file and then post-processing it into smaller files?

If the collection is 5 GB, can Data Factory split it into chunks of roughly 100 MB as the copy runs?



Solution 1:[1]

I would suggest using partitioning: set the Partition option on the sink's Optimize tab in a mapping data flow. This writes the output as multiple files instead of one.

Refer to https://docs.microsoft.com/en-us/azure/data-factory/concepts-data-flow-performance#optimize-tab
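As a rough illustration, the sink's Optimize-tab settings show up in the data flow script as a `partitionBy(...)` clause. A minimal sketch, assuming hypothetical stream names (`mongoSource`, `adlsSink`) and round-robin partitioning into 50 partitions, each of which becomes its own output file:

```
source(allowSchemaDrift: true,
	validateSchema: false) ~> mongoSource
mongoSource sink(allowSchemaDrift: true,
	validateSchema: false,
	partitionBy('roundRobin', 50)) ~> adlsSink
```

For a 5 GB collection, 50 round-robin partitions would land files of roughly 100 MB each; you can tune the partition count to hit your target file size.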

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 AbhishekKhandave-MT