How to trigger a pipeline ONLY when 4 different files are updated (overwritten) in 4 folders in the same container?
In Azure blob storage, I have a container where I have 4 files in 4 folders.
I would like to trigger an Azure Data Factory pipeline only when all 4 of these files have been overwritten (4 files with the same names are added with new data).
The pipeline needs all 4 files, so I want it to run only once all 4 have been replaced by new ones.
How can I do this? I know that ADF has storage event triggers for pipelines, but I don't know how to use them with several files.
---------- UPDATE BELOW --------
I found this explanation, but I really don't know how to do steps 2 and 3:
1) Setup a Storage event triggered pipeline on first destination i.e. container/folder1/file1.csv.
2) Then, perhaps after waiting a few seconds with a Wait activity, use a Get Metadata activity with Child items in the Field list argument to get the list of files in the folder,
or:
chain Lookup activities to look for the files at container/folder2/file2.csv and container/folder3/file3.csv with the file list path property.
3) Then hold the results in variables for convenience and use a conditional activity such as If Condition to check whether all the files exist. If true, proceed with the further activities you plan to run in the pipeline once the three files have arrived.
OR this explanation:
Create an ADF pipeline with a) a Get Metadata activity that checks whether the 4 required files are there, and b) if yes, an Execute Pipeline activity that triggers the pipeline that should run when all 4 files are present; if not, ignore, throw an error, etc. Create event triggers for the files and associate them with this pipeline.
So on the fourth event trigger, all the files would be found and the main pipeline would be executed.
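One caveat with both explanations: since the files are overwritten, they always exist, so an existence check alone would pass on the very first event. A possible way around this is to compare each file's last-modified time against the time of the last successful run. Below is a minimal Python sketch of that check only, not an ADF implementation. The paths in `REQUIRED_FILES`, the function name `all_files_refreshed`, and the `last_run` value are hypothetical; in ADF the timestamps would presumably come from the Last modified field of a Get Metadata activity, and the comparison would sit in an If Condition.

```python
from datetime import datetime, timezone

# Hypothetical paths: the question does not name the real folders or files.
REQUIRED_FILES = [
    "folder1/file1.csv",
    "folder2/file2.csv",
    "folder3/file3.csv",
    "folder4/file4.csv",
]


def all_files_refreshed(last_modified, last_run, required=REQUIRED_FILES):
    """Return True only if every required file exists and is newer than last_run.

    last_modified maps a blob path to its last-modified datetime.
    last_run is the time of the previous successful pipeline run (an assumption:
    it has to be stored somewhere, e.g. in a small control table or file).
    """
    for path in required:
        modified = last_modified.get(path)
        # A missing file, or one not overwritten since the last run, blocks the run.
        if modified is None or modified <= last_run:
            return False
    return True
```

Each of the 4 event triggers would run this check; only the event that completes the set sees all 4 files newer than `last_run`, so the main pipeline starts once.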
Solution 1:[1]
Use a metadata table that captures the filename, the datetime of the file, and an isactive flag set to 0. This data can be pulled from the files placed in the blob container, so initially the table holds the 4 files with 1900-01-01 dates. Whenever a new file is added to a folder, insert an entry into the metadata table. Then, in a stored procedure, compare the old and new dates for the same filename, and if the difference is greater than 0, set isactive to 1. Run your ADF pipeline only when all 4 filenames have isactive set to 1.
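To make the idea concrete, here is a rough sketch of the metadata-table logic using an in-memory SQLite database in Python. It only illustrates the bookkeeping; the answer does not say which database is used. The table name `file_meta`, the column `file_datetime`, and all function names are made up for illustration, the stored procedure is replaced by a plain `UPDATE`, and the `reset` step after a run is my assumption (the answer does not say how isactive goes back to 0).

```python
import sqlite3

BASELINE = "1900-01-01T00:00:00"  # initial date from the answer, as an ISO 8601 string


def init_table(conn, filenames):
    """Create the metadata table with every file inactive and dated 1900-01-01."""
    conn.execute(
        "CREATE TABLE file_meta ("
        "filename TEXT PRIMARY KEY, file_datetime TEXT, isactive INTEGER)"
    )
    conn.executemany(
        "INSERT INTO file_meta VALUES (?, ?, 0)",
        [(name, BASELINE) for name in filenames],
    )


def record_arrival(conn, filename, file_datetime):
    """Mark a file active only if the new datetime is later than the stored one.

    ISO 8601 strings in the same format compare correctly as plain text.
    """
    conn.execute(
        "UPDATE file_meta SET file_datetime = ?, isactive = 1 "
        "WHERE filename = ? AND ? > file_datetime",
        (file_datetime, filename, file_datetime),
    )


def ready_to_run(conn):
    """True when no tracked file is still inactive."""
    (inactive,) = conn.execute(
        "SELECT COUNT(*) FROM file_meta WHERE isactive = 0"
    ).fetchone()
    return inactive == 0


def reset(conn):
    """Assumption: clear the flags after the main pipeline has run."""
    conn.execute("UPDATE file_meta SET isactive = 0")
```

In ADF, the `ready_to_run` query could be a Lookup activity feeding an If Condition, with the storage event trigger pipeline calling the equivalent of `record_arrival` first.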
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Srishuk Kumar Bhagwat |
