Airflow + Python logging module doesn't write to log file
I'm trying to write "hello world" into an Airflow log (Airflow 1.10.3). Based on the SO solutions presented here and here, I should be able to just import logging and call logging.info('hello world'). That doesn't seem to work for me.
import logging
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python_operator import PythonOperator

default_args = {
    'owner': 'benten',
    'depends_on_past': False,
    'start_date': datetime(2019, 7, 25),
    'email_on_failure': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=1),
}

def logging_is_fun():
    logging.debug("hello world")
    logging.info("hello world")
    logging.critical("hello world")
    return None

with DAG('fun_logs', schedule_interval='45 * * * *', default_args=default_args) as dag:
    log_task = PythonOperator(python_callable=logging_is_fun, task_id='log_test_task')
I trigger the DAG manually and the task executes with no problems. But alas, when I check the logs, all I see is this:
*** Reading local file: /home/ubuntu/airflow/logs/fun_logs/log_test_task/2019-08-31T19:22:49.653712+00:00/1.log
Where are my amazing "hello world" statements? I don't expect to see all of them, given my log level setting. I do expect to see the critical message, though.
My airflow.cfg has the following in it (all default settings to the best of my knowledge):
# The folder where airflow should store its log files
# This path must be absolute
base_log_folder = /home/ubuntu/airflow/logs
# Airflow can store logs remotely in AWS S3, Google Cloud Storage or Elastic Search.
# Users must supply an Airflow connection id that provides access to the storage
# location. If remote_logging is set to true, see UPDATING.md for additional
# configuration requirements.
remote_logging = False
remote_log_conn_id =
remote_base_log_folder =
encrypt_s3_logs = False
# Logging level
logging_level = WARN
fab_logging_level = WARN
# Logging class
# Specify the class that will specify the logging configuration
# This class has to be on the python classpath
# logging_config_class = my.path.default_local_settings.LOGGING_CONFIG
logging_config_class =
# Log format
log_format = [%%(asctime)s] {%%(filename)s:%%(lineno)d} %%(levelname)s - %%(message)s
simple_log_format = %%(asctime)s %%(levelname)s - %%(message)s
# Log filename format
log_filename_template = {{ ti.dag_id }}/{{ ti.task_id }}/{{ ts }}/{{ try_number }}.log
log_processor_filename_template = {{ filename }}.log
dag_processor_manager_log_location = /home/ubuntu/airflow/logs/dag_processor_manager/dag_processor_manager.log
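An aside on the level setting above: logging_level = WARN means that, of the three calls in logging_is_fun(), only the critical one clears the threshold. A plain-Python sketch of that filtering, using the standard library alone (outside Airflow, just to show which records survive):

import logging

# With the root logger set to WARNING (the level WARN maps to),
# records below that level are dropped by the filter.
logging.basicConfig(level=logging.WARNING)

logging.debug("hello world")     # dropped: DEBUG < WARNING
logging.info("hello world")      # dropped: INFO < WARNING
logging.critical("hello world")  # emitted: CRITICAL >= WARNING

So with the configuration shown, only the critical message is expected in the task log; raising logging_level to INFO in airflow.cfg would let the info message through as well.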
Solution 1:[1]
The only thing missing is actually running the task. Just add log_task.run() and your logs will be there for you.
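For concreteness, a hedged sketch of what that looks like, assuming the same DAG file as in the question (imports, default_args and logging_is_fun unchanged; the filename fun_logs.py is just an example):

# ... imports, default_args and logging_is_fun exactly as in the question ...

with DAG('fun_logs', schedule_interval='45 * * * *', default_args=default_args) as dag:
    log_task = PythonOperator(python_callable=logging_is_fun, task_id='log_test_task')

# Added line: runs the task when the file is executed directly,
# e.g. with `python fun_logs.py`, so the log calls fire immediately.
# With no arguments, the operator's run() walks the schedule from
# start_date onward, which is why the output below repeats.
log_task.run()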
I just tested it and it worked; here's what the output looks like:
[2019-09-03 02:19:15,990] {main.py:18} CRITICAL - hello world
[2019-09-03 02:19:16,024] {main.py:17} INFO - hello world
[2019-09-03 02:19:16,024] {main.py:18} CRITICAL - hello world
[2019-09-03 02:19:16,060] {main.py:17} INFO - hello world
[2019-09-03 02:19:16,061] {main.py:18} CRITICAL - hello world
[2019-09-03 02:19:16,100] {main.py:17} INFO - hello world
[2019-09-03 02:19:16,100] {main.py:18} CRITICAL - hello world
...
...
Hope it helps.
Solution 2:[2]
Disclaimer: I'm still figuring this out too. But this is my understanding so far.
Airflow has multiple ways of logging. Tasks log to their own files, not to stdout the way the Airflow services themselves do.
You will probably find your logs through the web UI:
- Go to the DAG
- Click a task instance
- Click [View Log]
(Verified with your example DAG on my machine, using my own conf: the steps above show "hello world", but stdout in the terminal does not.)
From what I've seen, this is the only type of log affected by the logging_level configuration, which, by the way, is INFO by default.
From the docs these logs are stored at {dag_id}/{task_id}/{execution_date}/{try_number}.log
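As a sanity check, here is a small sketch of how that template resolves for the run in the question; the values are taken from the path the question's log viewer printed above:

# Resolving log_filename_template by hand for the question's task instance.
base_log_folder = "/home/ubuntu/airflow/logs"
template = "{dag_id}/{task_id}/{execution_date}/{try_number}.log"

path = template.format(
    dag_id="fun_logs",
    task_id="log_test_task",
    execution_date="2019-08-31T19:22:49.653712+00:00",
    try_number=1,
)
print(base_log_folder + "/" + path)
# /home/ubuntu/airflow/logs/fun_logs/log_test_task/2019-08-31T19:22:49.653712+00:00/1.log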
Astronomer also has a guide here, but I haven't gotten around to trying it out.
Solution 3:[3]
I'm bumping an old thread here, but if someone is still struggling with this, I have another solution. In my case, logging_level was already set to INFO and initializing a logger with "airflow.task" did nothing; no logs were showing up. What ended up working instead was clearing the "default_impersonation" setting in airflow.cfg. You only need that setting if you're unable to use the 'run_as_user' option. Reference: https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#default-impersonation
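In airflow.cfg terms, the change amounts to something like this (a sketch; default_impersonation lives in the [core] section, and empty is its default value):

[core]
# Only set this if tasks must run as a specific unix user and you cannot
# use run_as_user; a non-empty value here is what hid the task logs above.
default_impersonation =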
After clearing the setting, logs showed up nicely.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Haroun Mohammedi |
| Solution 2 | AdamAL |
| Solution 3 | LTJ |
