AWS Glue - Job Monitoring: Job Execution, Active Executors and Maximum Needed Executors not showing

I have set up an ETL job in AWS Glue with the following settings: Glue 3.0, Python 3, Spark 3.1, worker type G.1X with 10 workers, and job metrics enabled.

When I look at the job metrics after the job has finished, only the Number of Completed Stages shows up in the Job Execution: Active Executors, Completed Stages & Maximum Needed Executors chart. The Spark session is set up with the code below.

Why can I not see the metrics for Number of Active Executors and Number of Maximum Needed Executors? I am also including screenshots of the Memory Profile and CPU Load in case they help.

Thank you!!

#Glue context & Spark session
## @params: [JOB_NAME]
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark import SparkConf
from pyspark.sql import SparkSession

args = getResolvedOptions(sys.argv, ['JOB_NAME'])

# Works around issues with old datetimes in the new Parquet reader/writer in Spark 3
spark_conf = SparkConf()
spark_conf.setAll([
    ('spark.sql.legacy.parquet.int96RebaseModeInRead', 'CORRECTED'),
    ('spark.sql.legacy.parquet.int96RebaseModeInWrite', 'CORRECTED'),
    ('spark.sql.legacy.parquet.datetimeRebaseModeInRead', 'CORRECTED'),
    ('spark.sql.legacy.parquet.datetimeRebaseModeInWrite', 'CORRECTED')
])

session = SparkSession.builder.config(conf=spark_conf).enableHiveSupport().getOrCreate()
glueContext = GlueContext(session.sparkContext)  # GlueContext expects a SparkContext, not a SparkSession
spark = glueContext.spark_session
job = Job(glueContext)

[Screenshot: Job Execution metrics]

[Screenshot: Memory Load]

[Screenshot: CPU Load]



Solution 1:[1]

Got the answer from AWS Support: the 'glue.driver.ExecutorAllocationManager.executors.numberAllExecutors' and 'glue.driver.ExecutorAllocationManager.executors.numberMaxNeededExecutors' metrics are only available for Glue 1.0 ETL jobs.

For Glue 2.0 and 3.0 you can instead use the per-executor 'glue.executorId' metric names to find out how many executors were launched, along with the metrics for each executor. This is done in the CloudWatch console.
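The CloudWatch approach above can also be scripted. A minimal sketch, assuming Glue publishes its metrics under the 'Glue' namespace with per-executor names of the form 'glue.<executorId>.<metric>' (driver metrics use 'glue.driver.*'); the boto3 call, job name, and sample metric names here are illustrative, so verify the actual names in your CloudWatch console:

```python
import re

# Per-executor metric names are assumed to look like 'glue.<executorId>.<metric>',
# e.g. 'glue.1.jvm.heap.usage'; driver metrics use 'glue.driver.*' instead.
EXECUTOR_METRIC = re.compile(r'^glue\.(\d+)\.')

def executor_ids(metric_names):
    """Return the distinct executor IDs found in a list of Glue metric names."""
    return sorted({int(m.group(1)) for name in metric_names
                   if (m := EXECUTOR_METRIC.match(name))})

# In a real run the names would come from CloudWatch, e.g. (assumed sketch):
#   cw = boto3.client('cloudwatch')
#   names = [m['MetricName']
#            for page in cw.get_paginator('list_metrics').paginate(
#                Namespace='Glue',
#                Dimensions=[{'Name': 'JobName', 'Value': 'my-etl-job'}])
#            for m in page['Metrics']]

# Sample names for illustration:
names = [
    'glue.driver.ExecutorAllocationManager.executors.numberAllExecutors',
    'glue.1.jvm.heap.usage',
    'glue.2.jvm.heap.usage',
    'glue.2.s3.filesystem.read_bytes',
]
print(executor_ids(names))  # -> [1, 2]
```

The count of distinct IDs tells you how many executors were launched over the job run.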

Other useful insights can be gained by setting up the Spark UI and analyzing the job run through its visualizations.
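Enabling the Spark UI comes down to setting the documented special job parameters '--enable-spark-ui' and '--spark-event-logs-path' so the job writes Spark event logs to S3, which a Spark history server can then read. A minimal sketch; the bucket path and job name are placeholders, and the boto3 update call is commented out since it needs live AWS credentials:

```python
# Special job parameters that turn on Spark UI event logging for a Glue job.
# The S3 path is a placeholder; point it at a bucket your job role can write to.
spark_ui_args = {
    '--enable-spark-ui': 'true',
    '--spark-event-logs-path': 's3://my-bucket/glue-spark-logs/',
}

# Applying them to an existing job via boto3 (assumed sketch, not executed here):
#   glue = boto3.client('glue')
#   job = glue.get_job(JobName='my-etl-job')['Job']
#   merged = {**job.get('DefaultArguments', {}), **spark_ui_args}
#   glue.update_job(JobName='my-etl-job',
#                   JobUpdate={'Role': job['Role'],
#                              'Command': job['Command'],
#                              'DefaultArguments': merged})

print(sorted(spark_ui_args))
```

After the next job run, point a Spark history server at the event-log path to browse stages, executors, and timelines.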

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution 1: Qwaz