'Error Loading Delimited file into MySQL using Airflow( Error code 2068)

I have airflow installed on Ubuntu as WSL on windows.

I am trying to load a delimited file that is stored on my C drive into Mysql database using the code below:

import logging
import os
import csv

from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from airflow.operators.mysql_operator import MySqlOperator
from airflow.hooks.mysql_hook import MySqlHook

def bulk_load_sql(table_name, **kwargs):
    local_filepath = 'some c drive path'
    conn = MySqlHook(conn_name_attr='mysql_default')
    conn.bulk_load(table_name, local_filepath)
    return table_name

dag = DAG(
        "dag_name",
        start_date=datetime.datetime.now() - datetime.timedelta(days=1),
        schedule_interval=None)

t1 = PythonOperator(
        task_id='csv_to_stgtbl',
        provide_context=True,
        python_callable=bulk_load_sql,
        op_kwargs={'table_name': 'mysqltablnm'},
        dag=dag
        )

It gives the following exception:

MySQLdb._exceptions.OperationalError: (2068, 'LOAD DATA LOCAL INFILE file request rejected due to restrictions on access.')

I have checked the following setting on mysql and its ON SHOW GLOBAL VARIABLES LIKE 'local_infile'

Could someone please provide some pointers as to how to fix it. Is there any other way I can load a delimited file into mysql using airflow.



Solution 1:[1]

For now, I have implemented a work around as follows:

def load_staging():
        mysqlHook = MySqlHook(conn_name_attr='mysql_default')
        #cursor = conn.cursor()
        conn = mysqlHook.get_conn()
        cursor = conn.cursor()
        csv_data = csv.reader(open('c drive file path'))
        header = next(csv_data)
        logging.info('Importing the CSV Files')
        for row in csv_data:
                 #print(row)
                 cursor.execute("INSERT INTO table_name (col1,col2,col3) VALUES (%s, %s, %s)", 
            row)
        conn.commit()
        cursor.close()

t1 = PythonOperator(
        task_id='csv_to_stgtbl',
        python_callable=load_staging,
        dag=dag
        )

However, it would have been great if the LOAD DATA LOCAL INFILE would have worked.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Aragon