Spark-Submit command takes a while to run

We have installed Apache Hadoop and Spark on a cluster of servers running on IBM AIX (version 2) OS.

Hadoop version: hadoop-3.2.1
Spark version: spark-3.0.1

We are testing the overall Spark install by running the spark-submit --version command found in the $SPARK_HOME/bin folder. The command behaves inconsistently: the first run completes with no delay, but subsequent runs take a long time to execute (around 30 to 40 minutes). We have checked CPU and memory on the server: there are no issues with low memory or with applications hogging the processor. We are not able to pinpoint where the delay occurs when this command runs.
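
To put numbers on the difference between the first run and later runs, the command can be timed in a loop. The sketch below is only an illustration: time_runs is a hypothetical helper (not part of Spark), it assumes bash (the prompt in the logs shows bash-5.0), and it relies on the bash built-in SECONDS variable rather than on any particular date implementation.

```shell
#!/usr/bin/env bash
# time_runs is a hypothetical helper, not part of Spark.
# Usage: time_runs COUNT COMMAND [ARGS...]
# Runs COMMAND COUNT times and prints exit status and elapsed seconds per run.
time_runs() {
  local count=$1 i start status
  shift
  for ((i = 1; i <= count; i++)); do
    start=$SECONDS                 # bash built-in: seconds since shell start
    if "$@" >/dev/null 2>&1; then
      status=0
    else
      status=$?                    # exit status of the command that just ran
    fi
    echo "run $i: exit=$status elapsed=$((SECONDS - start))s"
  done
}

# Example (assumes SPARK_HOME is set):
#   time_runs 3 "$SPARK_HOME/bin/spark-submit" --version
```

Each run prints one line, so a slow second or third run shows up directly in the elapsed column.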

The same Hadoop/Spark setup works on a cluster running Red Hat 7.9, where we do not see this issue.

This is my first time asking a question on Stack Overflow. Please let me know if there is any more information I need to provide.

Thanks in advance.

=========================== Edit May 11th:

Log of a successful run (debug lines were added to the spark-submit script):

bash-5.0$ spark-submit --version
Entered spark submit
About to execute spark submit command.....
About to load spark env.sh
Loaded spark env.sh
Entered statement to create RUNNER
searching spark_home/jars
Loaded spark jars DIR
Launching class path
Launched class path
Entering build command
Completed build command
About to enter while block
Entered while block for Entered build command
Entered build command
CMD is
build_command is  and org.apache.spark.deploy.SparkSubmit --version
Entered while block for
For  changing delim to blank
CMD is
build_command is  and org.apache.spark.deploy.SparkSubmit --version
Entered while block for /u01/app/java8_64/bin/java
Entered if condition for /u01/app/java8_64/bin/java
CMD is /u01/app/java8_64/bin/java
build_command is  and org.apache.spark.deploy.SparkSubmit --version
Entered while block for -cp
Entered if condition for -cp
CMD is /u01/app/java8_64/bin/java -cp
build_command is  and org.apache.spark.deploy.SparkSubmit --version
Entered while block for /u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/conf/:/u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/jars/*:/u01/app/rmb/ria/AnthemSpark/hadoop-3.2.1/etc/hadoop/
Entered if condition for /u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/conf/:/u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/jars/*:/u01/app/rmb/ria/AnthemSpark/hadoop-3.2.1/etc/hadoop/
CMD is /u01/app/java8_64/bin/java -cp /u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/conf/:/u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/jars/*:/u01/app/rmb/ria/AnthemSpark/hadoop-3.2.1/etc/hadoop/
build_command is  and org.apache.spark.deploy.SparkSubmit --version
Entered while block for -Xmx1g
Entered if condition for -Xmx1g
CMD is /u01/app/java8_64/bin/java -cp /u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/conf/:/u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/jars/*:/u01/app/rmb/ria/AnthemSpark/hadoop-3.2.1/etc/hadoop/ -Xmx1g
build_command is  and org.apache.spark.deploy.SparkSubmit --version
Entered while block for org.apache.spark.deploy.SparkSubmit
Entered if condition for org.apache.spark.deploy.SparkSubmit
CMD is /u01/app/java8_64/bin/java -cp /u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/conf/:/u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/jars/*:/u01/app/rmb/ria/AnthemSpark/hadoop-3.2.1/etc/hadoop/ -Xmx1g org.apache.spark.deploy.SparkSubmit
build_command is  and org.apache.spark.deploy.SparkSubmit --version
Entered while block for --version
Entered if condition for --version
CMD is /u01/app/java8_64/bin/java -cp /u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/conf/:/u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/jars/*:/u01/app/rmb/ria/AnthemSpark/hadoop-3.2.1/etc/hadoop/ -Xmx1g org.apache.spark.deploy.SparkSubmit --version
build_command is  and org.apache.spark.deploy.SparkSubmit --version
Entered while block for 0
Entered if condition for 0
CMD is /u01/app/java8_64/bin/java -cp /u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/conf/:/u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/jars/*:/u01/app/rmb/ria/AnthemSpark/hadoop-3.2.1/etc/hadoop/ -Xmx1g org.apache.spark.deploy.SparkSubmit --version 0
build_command is  and org.apache.spark.deploy.SparkSubmit --version
CMD is /u01/app/java8_64/bin/java -cp /u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/conf/:/u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/jars/*:/u01/app/rmb/ria/AnthemSpark/hadoop-3.2.1/etc/hadoop/ -Xmx1g org.apache.spark.deploy.SparkSubmit --version 0
completed while block
About to execute /u01/app/java8_64/bin/java -cp /u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/conf/:/u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/jars/*:/u01/app/rmb/ria/AnthemSpark/hadoop-3.2.1/etc/hadoop/ -Xmx1g 
   org.apache.spark.deploy.SparkSubmit --version
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.0.1
      /_/

Using Scala version 2.12.10, IBM J9 VM, 1.8.0_251
Branch HEAD
Compiled by user ubuntu on 2020-08-28T08:58:35Z
Revision 2b147c4cd50da32fe2b4167f97c8142102a0510d
Url https://gitbox.apache.org/repos/asf/spark.git
Type --help for more information.

=============================================================


Failed run:

bash-5.0$ spark-submit --version
Entered spark submit
About to execute spark submit command.....
About to load spark env.sh
Loaded spark env.sh
Entered statement to create RUNNER
searching spark_home/jars
Loaded spark jars DIR
Launching class path
Launched class path
Entering build command
Completed build command
About to enter while block
Entered while block for Entered build command
Entered build command
CMD is
build_command is  and org.apache.spark.deploy.SparkSubmit --version
Entered while block for
For  changing delim to blank
CMD is
build_command is  and org.apache.spark.deploy.SparkSubmit --version
Entered while block for /u01/app/java8_64/bin/java
Entered if condition for /u01/app/java8_64/bin/java
CMD is /u01/app/java8_64/bin/java
build_command is  and org.apache.spark.deploy.SparkSubmit --version
Entered while block for -cp
Entered if condition for -cp
CMD is /u01/app/java8_64/bin/java -cp
build_command is  and org.apache.spark.deploy.SparkSubmit --version
Entered while block for /u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/conf/:/u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/jars/*:/u01/app/rmb/ria/AnthemSpark/hadoop-3.2.1/etc/hadoop/
Entered if condition for /u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/conf/:/u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/jars/*:/u01/app/rmb/ria/AnthemSpark/hadoop-3.2.1/etc/hadoop/
CMD is /u01/app/java8_64/bin/java -cp /u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/conf/:/u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/jars/*:/u01/app/rmb/ria/AnthemSpark/hadoop-3.2.1/etc/hadoop/
build_command is  and org.apache.spark.deploy.SparkSubmit --version
Entered while block for -Xmx1g
Entered if condition for -Xmx1g
CMD is /u01/app/java8_64/bin/java -cp /u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/conf/:/u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/jars/*:/u01/app/rmb/ria/AnthemSpark/hadoop-3.2.1/etc/hadoop/ -Xmx1g
build_command is  and org.apache.spark.deploy.SparkSubmit --version
Entered while block for org.apache.spark.deploy.SparkSubmit
Entered if condition for org.apache.spark.deploy.SparkSubmit
CMD is /u01/app/java8_64/bin/java -cp /u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/conf/:/u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/jars/*:/u01/app/rmb/ria/AnthemSpark/hadoop-3.2.1/etc/hadoop/ -Xmx1g org.apache.spark.deploy.SparkSubmit
build_command is  and org.apache.spark.deploy.SparkSubmit --version
Entered while block for --version
Entered if condition for --version
CMD is /u01/app/java8_64/bin/java -cp /u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/conf/:/u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/jars/*:/u01/app/rmb/ria/AnthemSpark/hadoop-3.2.1/etc/hadoop/ -Xmx1g org.apache.spark.deploy.SparkSubmit --version
build_command is  and org.apache.spark.deploy.SparkSubmit --version
Entered while block for 0
Entered if condition for 0
CMD is /u01/app/java8_64/bin/java -cp /u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/conf/:/u01/app/rmb/ria/AnthemSpark/spark-3.0.1-bin-hadoop3.2/jars/*:/u01/app/rmb/ria/AnthemSpark/hadoop-3.2.1/etc/hadoop/ -Xmx1g org.apache.spark.deploy.SparkSubmit --version 0
build_command is  and org.apache.spark.deploy.SparkSubmit --version
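
The failed log stops inside the while block right after the trailing 0 is consumed and never prints "completed while block". The log suggests a loop that reads one argument at a time from the output of a build command; a loop of that shape ends only when it sees end-of-file on its input. The sketch below is not the Spark script itself: emit_args, emit_args_with_lingering_child and read_args are hypothetical names, and the argument values are placeholders. It only shows that such a loop keeps waiting for as long as any process still holds the write end of the pipe, even after the last argument has arrived. Whether that is what happens on AIX here is an assumption, not something the logs confirm.

```shell
#!/usr/bin/env bash
# emit_args is a hypothetical stand-in for the build command: it prints
# NUL-terminated arguments followed by an exit code.
emit_args() {
  printf '%s\0' java -cp /some/classpath -Xmx1g SomeMainClass --version
  printf '%d\0' 0
}

# Same producer, but a background child keeps the pipe's write end open
# for $1 seconds after the last argument has been written.
emit_args_with_lingering_child() {
  emit_args
  sleep "$1" &
}

# Read NUL-delimited arguments from "$@" until EOF, then report the count.
read_args() {
  local arg
  CMD=()
  while IFS= read -r -d '' arg; do
    CMD+=("$arg")
  done < <("$@")
  echo "read ${#CMD[@]} arguments"
}

# Example:
#   read_args emit_args                          # returns immediately
#   read_args emit_args_with_lingering_child 5   # same result, about 5 s later
```

Both calls collect the same arguments; only the time at which the loop is allowed to finish differs.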

##########################

EDIT - March 12th

These are the last few lines of output when running the command under truss -d. The final line shows the process going to "sleep".

0.9063:        lseek(0, 0, 1)                   Err#29 ESPIPE
0.9066:        fstatx(0, 0x0FFFFFFFFFFFE8F8, 176, 0) = 0
0.9068:        _sigaction(14, 0x0FFFFFFFFFFFE710, 0x0FFFFFFFFFFFE740) = 0
0.9071:        incinterval(0, 0x0FFFFFFFFFFFE640, 0x0FFFFFFFFFFFE660) = 0
0.9073:        kread(0, " o", 1)                = 1
0.9075:        kread(0, " r", 1)                = 1
0.9078:        kread(0, " g", 1)                = 1
0.9080:        kread(0, " .", 1)                = 1
0.9082:        kread(0, " a", 1)                = 1
0.9084:        kread(0, " p", 1)                = 1
0.9086:        kread(0, " a", 1)                = 1
0.9089:        kread(0, " c", 1)                = 1
0.9091:        kread(0, " h", 1)                = 1
0.9093:        kread(0, " e", 1)                = 1
0.9095:        kread(0, " .", 1)                = 1
0.9097:        kread(0, " s", 1)                = 1
0.9100:        kread(0, " p", 1)                = 1
0.9102:        kread(0, " a", 1)                = 1
0.9104:        kread(0, " r", 1)                = 1
0.9106:        kread(0, " k", 1)                = 1
0.9108:        kread(0, " .", 1)                = 1
0.9111:        kread(0, " d", 1)                = 1
0.9113:        kread(0, " e", 1)                = 1
0.9115:        kread(0, " p", 1)                = 1
0.9117:        kread(0, " l", 1)                = 1
0.9119:        kread(0, " o", 1)                = 1
0.9122:        kread(0, " y", 1)                = 1
0.9124:        kread(0, " .", 1)                = 1
0.9126:        kread(0, " S", 1)                = 1
0.9128:        kread(0, " p", 1)                = 1
0.9130:        kread(0, " a", 1)                = 1
0.9132:        kread(0, " r", 1)                = 1
0.9135:        kread(0, " k", 1)                = 1
0.9137:        kread(0, " S", 1)                = 1
0.9139:        kread(0, " u", 1)                = 1
0.9141:        kread(0, " b", 1)                = 1
0.9143:        kread(0, " m", 1)                = 1
0.9187:        kread(0, " i", 1)                = 1
0.9190:        kread(0, " t", 1)                = 1
0.9192:        kread(0, "\0", 1)                = 1
0.9195:        incinterval(0, 0x0FFFFFFFFFFFE5C0, 0x0FFFFFFFFFFFE5E0) = 0
0.9197:        _sigaction(14, 0x0FFFFFFFFFFFE690, 0x0FFFFFFFFFFFE6C0) = 0
0.9200:        kfcntl(1, F_GETFL, 0x0000000000000000) = 67110914
0.9204:        kfcntl(1, F_GETFL, 0x0000000000000000) = 67110914
0.9207:        kioctl(0, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY
0.9211:        lseek(0, 0, 1)                   Err#29 ESPIPE
0.9214:        fstatx(0, 0x0FFFFFFFFFFFE8F8, 176, 0) = 0
0.9216:        _sigaction(14, 0x0FFFFFFFFFFFE710, 0x0FFFFFFFFFFFE740) = 0
0.9219:        incinterval(0, 0x0FFFFFFFFFFFE640, 0x0FFFFFFFFFFFE660) = 0
0.9222:        kread(0, " -", 1)                = 1
0.9224:        kread(0, " -", 1)                = 1
0.9227:        kread(0, " v", 1)                = 1
0.9229:        kread(0, " e", 1)                = 1
0.9231:        kread(0, " r", 1)                = 1
0.9234:        kread(0, " s", 1)                = 1
0.9236:        kread(0, " i", 1)                = 1
0.9238:        kread(0, " o", 1)                = 1
0.9241:        kread(0, " n", 1)                = 1
0.9243:        kread(0, "\0", 1)                = 1
0.9245:        incinterval(0, 0x0FFFFFFFFFFFE5C0, 0x0FFFFFFFFFFFE5E0) = 0
0.9248:        _sigaction(14, 0x0FFFFFFFFFFFE690, 0x0FFFFFFFFFFFE6C0) = 0
0.9251:        kfcntl(1, F_GETFL, 0x0000000000000000) = 67110914
0.9254:        kfcntl(1, F_GETFL, 0x0000000000000000) = 67110914
0.9257:        kioctl(0, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY
0.9260:        lseek(0, 0, 1)                   Err#29 ESPIPE
0.9262:        fstatx(0, 0x0FFFFFFFFFFFE8F8, 176, 0) = 0
0.9265:        _sigaction(14, 0x0FFFFFFFFFFFE710, 0x0FFFFFFFFFFFE740) = 0
0.9268:        incinterval(0, 0x0FFFFFFFFFFFE640, 0x0FFFFFFFFFFFE660) = 0
0.9270:        kread(0, " 0", 1)                = 1
0.9273:        kread(0, "\0", 1)                = 1
0.9275:        incinterval(0, 0x0FFFFFFFFFFFE5C0, 0x0FFFFFFFFFFFE5E0) = 0
0.9278:        _sigaction(14, 0x0FFFFFFFFFFFE690, 0x0FFFFFFFFFFFE6C0) = 0
0.9281:        kfcntl(1, F_GETFL, 0x0000000000000000) = 67110914
0.9284:        kfcntl(1, F_GETFL, 0x0000000000000020) = 67110914
0.9287:        kioctl(0, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY
0.9290:        lseek(0, 0, 1)                   Err#29 ESPIPE
0.9292:        fstatx(0, 0x0FFFFFFFFFFFE8F8, 176, 0) = 0
0.9295:        _sigaction(14, 0x0FFFFFFFFFFFE710, 0x0FFFFFFFFFFFE740) = 0
0.9297:        incinterval(0, 0x0FFFFFFFFFFFE640, 0x0FFFFFFFFFFFE660) = 0
2.9303:        kread(0, "\t", 1) (sleeping...)
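
The single-byte kread calls above are easier to follow once they are joined back into strings. The sketch below is a hypothetical helper (decode_kread is not a truss feature) that assumes the line layout shown in this trace: a one-character payload in quotes, padded with a leading space, with "\0" marking the end of each argument. Reads that did not complete, such as the final sleeping one, are skipped.

```shell
#!/usr/bin/env bash
# decode_kread: hypothetical helper that rebuilds the strings read one byte
# at a time from fd 0, based on the truss line layout shown in this trace.
# Reads truss output on stdin and prints one decoded argument per line.
decode_kread() {
  # Keep only completed one-byte reads and extract the quoted payload.
  sed -n 's/.*kread(0, "\(.*\)", 1)[[:space:]]*= 1[[:space:]]*$/\1/p' |
    awk '{
      if ($0 == "\\0") {
        printf "\n"              # NUL terminator: end of one argument
      } else {
        sub(/^ /, "")            # drop the padding space inside the quotes
        printf "%s", $0
      }
    }
    END { printf "\n" }'
}

# Example (truss.out is a hypothetical file holding the trace):
#   decode_kread < truss.out
```

Applied to the excerpt above, this should yield the tail of the class name, then --version, then 0, which matches the last arguments shown in the failed log.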


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
