Customize Airflow / TFX pipeline to upload files to S3
I am currently trying to customize a TFX pipeline that is orchestrated by Airflow.
Basically, I want to upload all files generated by the DAG to an S3 bucket after the pipeline run finishes.
The issue is that every TFX pipeline component gets parsed into an individual DAG, so I cannot wire a downstream task to them via something like `abc.set_downstream(xy)`.
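To make the goal concrete, this is roughly the downstream task I would attach if the whole pipeline were a single DAG. This is only a sketch of my intent; the bucket name, the `_pipeline_root` path, and the task id are placeholders of my own:

```python
# Sketch only: what I would add if there were one pipeline-wide DAG.
# Bucket name, _pipeline_root and the task id are placeholders of mine.
import os

import boto3
from airflow.operators.python_operator import PythonOperator

_pipeline_root = '/home/airflow/tfx/pipelines/taxi'  # placeholder path


def _upload_generated_files():
    """Walk the pipeline root and mirror every file into an S3 bucket."""
    s3 = boto3.client('s3')
    for root, _, files in os.walk(_pipeline_root):
        for name in files:
            local_path = os.path.join(root, name)
            key = os.path.relpath(local_path, _pipeline_root)
            s3.upload_file(local_path, 'my-tfx-artifacts', key)


upload_task = PythonOperator(
    task_id='upload_artifacts_to_s3',
    python_callable=_upload_generated_files,
)

# The part I am stuck on: each TFX component lives in its own DAG,
# so there is no pusher_task I could call .set_downstream(upload_task)
# on, and no single pipeline DAG to attach upload_task to.
```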
This matches my understanding of how the pipeline gets built; see the dag-runner documentation.
My code is based on the tfx-taxi-example; the runner setup looks roughly like the sketch below.
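This is paraphrased from memory, and the exact module paths and config keys differ between TFX versions, so treat it as an approximation of my setup rather than the example verbatim:

```python
# Paraphrased from the tfx-taxi-example; module paths vary across
# TFX versions, so this is an approximation of my setup.
import datetime

from tfx.orchestration.airflow.airflow_dag_runner import AirflowDagRunner
from tfx.orchestration.airflow.airflow_dag_runner import AirflowPipelineConfig

_airflow_config = {
    'schedule_interval': None,
    'start_date': datetime.datetime(2019, 1, 1),
}

# The runner translates the TFX components into Airflow objects itself,
# which is why I never hold a single DAG object that I could extend
# with an extra upload task.
DAG = AirflowDagRunner(AirflowPipelineConfig(_airflow_config)).run(
    _create_pipeline())  # _create_pipeline as defined in the taxi example
```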
Maybe I have not fully grasped the concepts of the TFX pipeline yet, so please let me know if I am trying to solve this in an overly complex way. I would greatly appreciate an alternative approach to what I am trying to accomplish ;)