Category "airflow"

In the Airflow UI under Connections, what do 'encrypted' and 'extra-encrypted' mean?

Simple question here. When looking at Airflow's UI, like in the screenshot below: What do Is Encrypted and Is Extra Encrypted stand for? Is there clear docum
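
These two columns indicate whether the connection's password and its extra JSON blob were stored Fernet-encrypted (i.e. a fernet_key was configured when the row was written). A minimal sketch, assuming an initialized Airflow 2.x metadata database, that inspects the flags via the ORM:

```python
# Sketch: inspect the is_encrypted / is_extra_encrypted flags on stored connections.
# Assumes an initialized Airflow 2.x metadata database.
from airflow.models import Connection
from airflow.utils.session import create_session

with create_session() as session:
    for conn in session.query(Connection).all():
        # Both flags are plain boolean columns on the connection table: they record
        # whether the password / extra field was Fernet-encrypted at write time.
        print(conn.conn_id, conn.is_encrypted, conn.is_extra_encrypted)
```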

Airflow 2.3 - Dynamic Task Mapping using Operators

I've got a current implementation of some code which works fine, but it only carries out a single check per DAG run, as I cannot feed through multiple results to do
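
In Airflow 2.3, a mapped operator can fan out one task instance per element of an upstream result via `.expand()`. A minimal sketch, assuming each check can be expressed as an argument to a single operator (the task names and commands are illustrative):

```python
# Sketch of dynamic task mapping in Airflow 2.3+ (names and commands are illustrative).
from datetime import datetime

from airflow.decorators import dag, task
from airflow.operators.bash import BashOperator


@dag(start_date=datetime(2023, 1, 1), schedule_interval=None, catchup=False)
def mapped_checks():
    @task
    def get_check_commands():
        # In the real DAG this list would come from the earlier task's result.
        return ["echo check_a", "echo check_b", "echo check_c"]

    # Airflow creates one mapped "run_check" task instance per element of the returned list.
    BashOperator.partial(task_id="run_check").expand(bash_command=get_check_commands())


mapped_checks()
```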

Using Airflow to upload data to S3

I tried to upload a dataframe containing information about Apple stock (using their API) as a CSV to S3 using Airflow and the PythonOperator. The script is below. Wh
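
A common pattern is to serialize the dataframe to an in-memory buffer and push it with S3Hook instead of writing to local disk first. A minimal sketch, assuming the Amazon provider is installed and an "aws_default" connection exists; bucket and key names are placeholders:

```python
# Sketch: upload a pandas DataFrame as CSV to S3 from a PythonOperator.
# Assumes the Amazon provider is installed and an "aws_default" connection exists;
# bucket/key names are placeholders, and the DataFrame stands in for the API call.
import io
from datetime import datetime

import pandas as pd
from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.amazon.aws.hooks.s3 import S3Hook


def upload_stock_csv(**_):
    df = pd.DataFrame({"symbol": ["AAPL"], "close": [190.0]})  # stand-in for the API response
    buffer = io.StringIO()
    df.to_csv(buffer, index=False)
    S3Hook(aws_conn_id="aws_default").load_string(
        string_data=buffer.getvalue(),
        key="stock/apple.csv",
        bucket_name="my-bucket",
        replace=True,
    )


with DAG("upload_stock_to_s3", start_date=datetime(2023, 1, 1),
         schedule_interval=None, catchup=False) as dag:
    PythonOperator(task_id="upload_csv", python_callable=upload_stock_csv)
```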

Apache Airflow: How to template Volumes in DockerOperator using Jinja Templating

I want to pass a list of volumes into the DockerOperator using a Jinja template. Hard-coded volumes work fine: volumes=['first:/dest', 'second:/sec_destination'] ho
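
The volume list is usually not in the DockerOperator's `template_fields`, so Jinja strings inside it are never rendered; a common workaround is to subclass the operator and extend `template_fields`. A hedged sketch, assuming a Docker provider version that still accepts a `volumes` argument (newer versions use `mounts`, and the same trick applies to that field):

```python
# Sketch: make the DockerOperator's volume list templatable by extending template_fields.
# Assumes the installed Docker provider still accepts `volumes`; the Variable name is a placeholder.
from datetime import datetime

from airflow import DAG
from airflow.providers.docker.operators.docker import DockerOperator


class TemplatedDockerOperator(DockerOperator):
    # Add 'volumes' to the fields Jinja renders before execute() runs.
    template_fields = (*DockerOperator.template_fields, "volumes")


with DAG("templated_volumes", start_date=datetime(2023, 1, 1),
         schedule_interval=None, catchup=False) as dag:
    TemplatedDockerOperator(
        task_id="run_container",
        image="alpine:3.18",
        command="echo hello",
        volumes=["{{ var.value.source_dir }}:/dest", "second:/sec_destination"],
    )
```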

Issue looping over Airflow variables

I am having a hard time looping over an Airflow variable in my script. I have a requirement to list all files prefixed by a string in a bucket, then loop throug
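
One way is to store the prefix in a Variable, list the matching keys with the relevant hook, and loop over the result inside a PythonOperator callable. A sketch assuming S3 and the Amazon provider; the bucket, Variable name, and connection id are placeholders:

```python
# Sketch: read a prefix from an Airflow Variable, list matching S3 keys, and loop over them.
# Bucket, Variable name, and connection id are placeholders.
from airflow.models import Variable
from airflow.providers.amazon.aws.hooks.s3 import S3Hook


def process_prefixed_files(**_):
    prefix = Variable.get("file_prefix")  # e.g. "incoming/2024-"; use deserialize_json=True for a JSON list
    hook = S3Hook(aws_conn_id="aws_default")
    keys = hook.list_keys(bucket_name="my-bucket", prefix=prefix) or []
    for key in keys:
        # Replace the print with whatever per-file processing the DAG needs.
        print(f"found object: {key}")
```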

How to extract the query result from a Hive job output logs using DataprocHiveOperator?

I am trying to build a data migration pipeline using Airflow, the source being a Hive table on a Dataproc cluster and the destination being BigQuery. I'm using Datapro
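
Dataproc writes the Hive driver's stdout (including the query result) to a GCS object referenced by the job's driverOutputResourceUri, so one approach is a follow-up task that downloads that object and parses the rows out of it. A hedged sketch, assuming the Google provider is installed; the bucket and object path below are hypothetical:

```python
# Sketch: fetch a Dataproc Hive job's driver output from GCS and extract result rows.
# The bucket/object path is hypothetical; in practice it comes from the job's
# driverOutputResourceUri. Assumes the Google provider and a default GCP connection.
from airflow.providers.google.cloud.hooks.gcs import GCSHook


def extract_hive_result(**_):
    hook = GCSHook(gcp_conn_id="google_cloud_default")
    raw = hook.download(
        bucket_name="my-dataproc-staging-bucket",                                   # hypothetical
        object_name="google-cloud-dataproc-metainfo/jobs/driveroutput.000000000",   # hypothetical
    )
    lines = raw.decode("utf-8").splitlines()
    # Hive interleaves the result rows with log lines; keep only data-looking rows.
    rows = [line for line in lines if line and not line.startswith(("INFO", "WARN", "Connecting"))]
    return rows
```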

Airflow + sqlalchemy short-lived connections to metadata db

I deployed the latest Airflow on a CentOS 7.5 VM and updated sql_alchemy_conn and result_backend to Postgres databases on a PostgreSQL instance, and designated m

sqlite3 raised an error after running an Airflow command

When I ran the command airflow list_users, it raised the error below: sqlite3.OperationalError: no such table: ab_permission_view_role ... sqlalchemy.exc.Opera

How do I create a chain for data with parent child relationship using python?

If I have this set of input to convert: Input: Task A -> Task B, Task A -> Task C, Task B -> Task D, Task C -> Task E. Making use of pandas in Python: df
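
With the edges in a two-column dataframe, one approach is to create one operator per distinct task name and then wire each parent to its child row by row. A sketch using placeholder EmptyOperator tasks; the DAG and task names are illustrative:

```python
# Sketch: build Airflow dependencies from a parent/child edge list held in pandas.
# The DAG and task names are illustrative.
from datetime import datetime

import pandas as pd
from airflow import DAG
from airflow.operators.empty import EmptyOperator

edges = pd.DataFrame(
    {"parent": ["Task A", "Task A", "Task B", "Task C"],
     "child":  ["Task B", "Task C", "Task D", "Task E"]}
)

with DAG("edge_list_dag", start_date=datetime(2023, 1, 1),
         schedule_interval=None, catchup=False) as dag:
    # One placeholder operator per distinct task name.
    names = pd.unique(edges[["parent", "child"]].values.ravel())
    tasks = {name: EmptyOperator(task_id=name.replace(" ", "_")) for name in names}

    # Wire each parent to its child, row by row.
    for _, row in edges.iterrows():
        tasks[row["parent"]] >> tasks[row["child"]]
```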

Read and group json files by date element using pyspark

I have multiple JSON files (~10 TB) in an S3 bucket, and I need to organize these files by a date element present in every JSON document. What I think that my c
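
With Spark this is typically a read of the whole prefix followed by a partitioned write on the date column, so the output lands in one folder per date. A sketch assuming each document carries a top-level "date" field and S3A access is already configured; the bucket paths are placeholders:

```python
# Sketch: regroup JSON documents into one output folder per date using PySpark.
# Bucket paths are placeholders; assumes S3A credentials are configured and every
# document has a top-level "date" field.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("group_json_by_date").getOrCreate()

df = spark.read.json("s3a://source-bucket/raw/")  # reads all JSON files under the prefix

(
    df.repartition("date")              # shuffle so each date's rows are written together
      .write.partitionBy("date")        # one folder per date value, e.g. .../date=2024-01-01/
      .mode("overwrite")
      .json("s3a://target-bucket/by-date/")
)
```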

Amazon Managed Airflow (MWAA) import custom plugins

I'm setting up an AWS MWAA instance and I have a problem importing custom plugins. My local project structure looks like this: airflow-project ├─
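
MWAA loads plugins from a plugins.zip uploaded to S3, and the plugin modules need to sit at the top level of that zip; plain helper modules in the zip can simply be imported by DAGs, while an AirflowPlugin subclass is only needed for things like macros or views. A minimal sketch of such a module; the names are illustrative:

```python
# Sketch: minimal content for plugins.zip on MWAA (file placed at the zip's top level,
# e.g. my_plugin.py). Names are illustrative.
from airflow.plugins_manager import AirflowPlugin


def days_ago_label(n):
    # Illustrative macro, exposed to Jinja templates under macros.my_plugin.days_ago_label
    return f"{n} days ago"


class MyPlugin(AirflowPlugin):
    name = "my_plugin"
    macros = [days_ago_label]
```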

Airflow - call an operator inside a function

I'm trying to call a PythonOperator which is inside a function using another PythonOperator. It seems I missed something; can someone help me find out what I
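
Constructing an operator inside another operator's python_callable does not run it; inside a PythonOperator you normally do the work directly (e.g. via the relevant hook), or as a workaround call the inner operator's execute() yourself. A sketch of both options; the DAG and task names are placeholders:

```python
# Sketch: two common ways to trigger operator-style work from inside a PythonOperator.
# DAG and task names are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.python import PythonOperator


def my_callable(**context):
    # Option 1 (preferred): do the work directly, e.g. via the relevant hook or plain Python.
    print("doing the work directly")

    # Option 2 (workaround): build the inner operator and call execute() yourself.
    # Merely constructing an operator inside a callable does NOT schedule or run it.
    inner = BashOperator(task_id="inner_echo", bash_command="echo from the inner operator")
    inner.execute(context=context)


with DAG("call_operator_in_callable", start_date=datetime(2023, 1, 1),
         schedule_interval=None, catchup=False) as dag:
    PythonOperator(task_id="run_it", python_callable=my_callable)
```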

DAG run as per timezone

I want to run my DAG according to the New York time zone, since the data arrives in New York time. The DAG fails for the initial runs and skips the last run's data as w
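
Giving the DAG a timezone-aware start_date via pendulum makes the scheduler interpret the cron schedule in that timezone, including DST shifts. A minimal sketch; the DAG id and schedule are illustrative:

```python
# Sketch: schedule a DAG in the America/New_York timezone via a pendulum-aware start_date.
import pendulum
from airflow import DAG
from airflow.operators.empty import EmptyOperator

local_tz = pendulum.timezone("America/New_York")

with DAG(
    dag_id="ny_schedule",
    start_date=pendulum.datetime(2023, 1, 1, tz=local_tz),
    schedule_interval="0 6 * * *",   # 06:00 New York time; DST is handled by the scheduler
    catchup=False,
) as dag:
    EmptyOperator(task_id="start")
```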

Airflow metrics with Prometheus and Grafana

Does anyone know how to send metrics from Airflow to Prometheus? I'm not finding much documentation about it. I tried the Airflow operator metrics on Grafana, but it d

Maximum memory size for an XCOM in Airflow

I was wondering if there is any memory size limit for an XCom in Airflow?

Airflow: get xcom from previous dag run

I am writing a sensor which scans S3 files for a fixed period of time and adds the list of new files that arrived in that period to XCom for the next task. For that, I am tryi
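
Inside a custom sensor's poke(), the task instance from the context can be asked for the same task's previous task instance and its XCom. A hedged sketch, assuming Airflow 2.x; the sensor name and XCom key are placeholders:

```python
# Sketch: read an XCom value written by the same task in the previous DAG run,
# from inside a custom sensor. Assumes Airflow 2.x; the class name and XCom key are placeholders.
from airflow.sensors.base import BaseSensorOperator


class NewS3FilesSensor(BaseSensorOperator):
    def poke(self, context):
        # Task instance of this task in the previous DAG run (None on the first run).
        prev_ti = context["ti"].get_previous_ti()
        previous_files = (
            prev_ti.xcom_pull(key="seen_files", task_ids=self.task_id) if prev_ti else None
        ) or []
        self.log.info("files seen in the previous run: %s", previous_files)

        # ... scan S3 here, diff against previous_files, then push the new list
        # with context["ti"].xcom_push(key="seen_files", value=new_files) ...
        return True
```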

Airflow installation issue | setproctitle

I have followed the steps below for Airflow installation and installed it successfully, but I am unable to run any command that starts with airflow, even to che

MWAA Airflow 2.0 in AWS Snowflake connection not showing

Snowflake is not showing in the connections dropdown. I am using MWAA 2.0 and the providers are already in requirements.txt. MWAA uses Python 3.7; I don't know i

Is there any way to install Airflow on Kubernetes on premises?

I am new to Airflow and need assistance on how to install Airflow on k8s. My needs are: 1. How to build a Docker image of Airflow only for the webserver and scheduler

MWAA - Tracking/monitoring progress of the "Updating" phase triggered by changing the version of requirements.txt used by an environment

I am working with Amazon Managed Workflows for Apache Airflow (MWAA). When I copy a new requirements.txt file to my S3 bucket, then use the AWS Console to speci
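
The update progress can be polled from the MWAA API while the environment is in the "Updating" phase. A sketch using boto3; the environment name is a placeholder and AWS credentials are assumed to be configured:

```python
# Sketch: poll an MWAA environment's status while the "Updating" phase runs.
# The environment name is a placeholder; requires boto3 and configured AWS credentials.
import time

import boto3

mwaa = boto3.client("mwaa")

while True:
    status = mwaa.get_environment(Name="my-mwaa-env")["Environment"]["Status"]
    print("environment status:", status)
    if status != "UPDATING":
        break
    time.sleep(60)
```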