Correct Approach For Airflow DAG Project

I am trying to see if Airflow is the right tool for some functionality I need for my project. We are trying to use it as a scheduler for running a sequence of jobs that start at a particular time (or possibly on demand).

  1. The first "task" is to query the database for the list of job IDs to sequence through.
  2. For each job in the sequence, send a REST request to start the job.
  3. Wait until the job completes or fails (checked via a REST call or a DB query).
  4. Go to the next job in the sequence.
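
To make the intended behaviour concrete, here is a minimal plain-Python sketch of the four steps above, independent of Airflow. Everything named here is an assumption for illustration: `run_sequence`, the `start_job` and `get_status` callables (stand-ins for the REST request and the REST/DB status check), the status strings, and the `stop_on_failure` switch, since I have not yet decided whether a failed job should halt the sequence.

```python
import time
from typing import Callable, Dict, Iterable

# Hypothetical status values; the real server or database may report others.
DONE, FAILED, TIMEOUT = "done", "failed", "timeout"


def run_sequence(
    job_ids: Iterable[str],
    start_job: Callable[[str], None],
    get_status: Callable[[str], str],
    poll_interval: float = 5.0,
    timeout: float = 3600.0,
    stop_on_failure: bool = False,
) -> Dict[str, str]:
    """Start each job in order and wait for it to finish before the next."""
    results: Dict[str, str] = {}
    for job_id in job_ids:  # step 1 supplies job_ids (e.g. from a DB query)
        start_job(job_id)  # step 2: e.g. a REST request
        deadline = time.monotonic() + timeout
        while True:  # step 3: poll until the job completes or fails
            status = get_status(job_id)
            if status in (DONE, FAILED):
                break
            if time.monotonic() >= deadline:
                status = TIMEOUT
                break
            time.sleep(poll_interval)
        results[job_id] = status
        if stop_on_failure and status != DONE:
            break  # assumption: optionally halt the sequence here
        # step 4: otherwise fall through to the next job
    return results
```

The open question is how to map this loop onto Airflow tasks, given that the list of job IDs is only known at run time.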

I am looking for recommendations on how to break the functionality described above down into an Airflow DAG. So far, my approach would be to:

  • create a Hook for the database and another for the REST server
  • create a custom operator that handles starting and monitoring the "job" (steps 2 and 3)
  • use a sensor to poll while waiting for the job to complete
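
As a rough illustration of the operator/sensor split I have in mind, the sketch below separates "start" from "wait" in plain Python. It is not Airflow code: `start`, `poke`, `wait_for`, `JobFailed`, and the status strings are hypothetical names, and `post` and `get_status` stand in for calls that would go through the hooks. The `poke` function only mirrors the general shape of an Airflow sensor's `poke` method, which returns a boolean to say whether the condition has been met.

```python
import time
from typing import Callable


class JobFailed(Exception):
    """Raised when a job reports a failed status (hypothetical)."""


def start(job_id: str, post: Callable[[str], None]) -> str:
    """Operator-like step: send the start request and return immediately."""
    post(job_id)
    return job_id


def poke(job_id: str, get_status: Callable[[str], str]) -> bool:
    """Sensor-like check: True when finished, False to keep waiting."""
    status = get_status(job_id)
    if status == "failed":  # assumed status string
        raise JobFailed(job_id)
    return status == "done"  # assumed status string


def wait_for(
    job_id: str,
    get_status: Callable[[str], str],
    interval: float = 5.0,
    max_pokes: int = 720,
) -> bool:
    """Call poke repeatedly; return False if the job never finishes."""
    for _ in range(max_pokes):
        if poke(job_id, get_status):
            return True
        time.sleep(interval)
    return False
```

Keeping the start request and the status check separate like this would let a failed job surface as a task failure, while the waiting part could be retried or timed out on its own.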

Thanks



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
