How to print arguments that I sent to an Airflow-EMR cluster?

I am executing EMR (spark-submit) through Airflow 2.0, and I am submitting steps as follows:

My s3://dbook/ bucket contains all the files needed for spark-submit. First I copy all the files to EMR (the "Copy S3 to EMR" step) and then execute the spark-submit command, but I am getting the error "no module named config". I need to know what args are being sent to the EMR cluster. How can I achieve this?

SPARK_STEPS = [
    {
        "Name": "Copy S3 to EMR",
        "ActionOnFailure": "CANCEL_AND_WAIT",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["aws", "s3", "cp", "s3://dbook/", ".", "--recursive"],
        },
    },
    {
        "Name": "Spark-Submit Command",
        "ActionOnFailure": "CANCEL_AND_WAIT",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": [
                "spark-submit",
                "--py-files",
                "config.zip,jobs.zip",
                "main.py",
            ],
        },
    },
]
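One way to see exactly what args Airflow will hand to EMR is to log each step's `Args` list before passing `SPARK_STEPS` to the operator. Below is a minimal, hypothetical sketch (the `log_step_args` helper is my own, not an Airflow API; the commented boto3 part assumes a real `cluster_id` and AWS credentials):

```python
import logging

logging.basicConfig(level=logging.INFO)

# Same shape as the SPARK_STEPS in the question (second step only, for brevity).
SPARK_STEPS = [
    {
        "Name": "Spark-Submit Command",
        "ActionOnFailure": "CANCEL_AND_WAIT",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["spark-submit", "--py-files", "config.zip,jobs.zip", "main.py"],
        },
    },
]

def log_step_args(steps):
    """Log the exact command line each EMR step will run, then return the steps."""
    for step in steps:
        args = step["HadoopJarStep"]["Args"]
        logging.info("Step %r runs: %s", step["Name"], " ".join(args))
    return steps

# To confirm what EMR actually received, you can also query the cluster
# afterwards with boto3 (cluster_id is a placeholder):
#   import boto3
#   emr = boto3.client("emr")
#   for s in emr.list_steps(ClusterId=cluster_id)["Steps"]:
#       print(s["Name"], s["Config"]["Args"])

log_step_args(SPARK_STEPS)
```

Calling `log_step_args(SPARK_STEPS)` in the DAG file (or inside a small PythonOperator) makes the submitted command lines visible in the task log, which is usually enough to spot why `config.zip` is not being picked up.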

Thanks, Xi



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
