PySpark client gets no result from Spark server in Docker, but it is connecting

I have a Spark cluster running in a Docker container. To test my configuration, I have a simple pyspark example program running on my desktop, outside the Docker container. The Spark console shows that the job is received, executed, and completed. However, the pyspark client never gets the results. [image of Spark console]
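For context, the test program itself is not shown in the question; a minimal sketch of the kind of program described might look like the following, where the master URL (spark://localhost:7077) and the trivial sum job are assumptions:

from pyspark.sql import SparkSession

# Minimal test job against the standalone master exposed by the Docker container.
# The master URL is an assumption; substitute the one your container publishes.
spark = (
    SparkSession.builder
    .appName("docker-connectivity-test")
    .master("spark://localhost:7077")
    .getOrCreate()
)

# A trivial action: on a healthy cluster this returns almost immediately.
print(spark.sparkContext.parallelize(range(100)).sum())

spark.stop()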

The pyspark program's console shows:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
22/03/05 11:42:23 WARN ProcfsMetricsGetter: Exception when trying to compute pagesize, as a result reporting of ProcessTree metrics is stopped
22/03/05 11:42:28 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
22/03/05 11:42:43 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
22/03/05 11:42:58 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
22/03/05 11:43:13 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
22/03/05 11:43:28 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
22/03/05 11:43:43 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

I know this warning is misleading, since the job did execute on the server.

If I click the kill link on the server, the pyspark program immediately gets:

22/03/05 11:46:22 ERROR Utils: Uncaught exception in thread stop-spark-context
org.apache.spark.SparkException: Exception thrown in awaitResult:
    at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:301)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
    at org.apache.spark.deploy.client.StandaloneAppClient.stop(StandaloneAppClient.scala:287)
    at org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend.org$apache$spark$scheduler$cluster$StandaloneSchedulerBackend$$stop(StandaloneSchedulerBackend.scala:259)
    at org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend.stop(StandaloneSchedulerBackend.scala:131)
    at org.apache.spark.scheduler.TaskSchedulerImpl.stop(TaskSchedulerImpl.scala:927)
    at org.apache.spark.scheduler.DAGScheduler.stop(DAGScheduler.scala:2567)
    at org.apache.spark.SparkContext.$anonfun$stop$12(SparkContext.scala:2086)
    at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1442)
    at org.apache.spark.SparkContext.stop(SparkContext.scala:2086)
    at org.apache.spark.SparkContext$$anon$3.run(SparkContext.scala:2035)
Caused by: org.apache.spark.SparkException: Could not find AppClient.

Thoughts on how to fix this?



Solution 1:[1]

There can be multiple reasons for this. Since your Spark cluster is running inside a Docker container, there is a good chance that the host running your pyspark driver is not reachable from the Spark nodes, even though the reverse direction works; that is why your Spark session gets created but gets killed a few seconds afterwards without ever receiving results.

You should make the driver host accessible from the Spark nodes so that the network connection is complete in both directions. If the error messages show a DNS name (in most cases this will be a container or host name), map it to the Docker host's IP address in the /etc/hosts file on all nodes of the Spark cluster.
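As a concrete illustration (not part of the original answer), the driver side of that fix can be pinned down in the pyspark program itself. Everything below is an assumption: 192.168.1.10 stands in for the desktop's LAN IP that the containers can reach, and the fixed ports must be open or published to the Docker network.

from pyspark.sql import SparkSession

# Sketch of a driver configuration for a standalone cluster running in Docker.
# All addresses and ports are examples; replace them with values that are
# actually reachable from inside the Docker network.
spark = (
    SparkSession.builder
    .appName("docker-connectivity-test")
    .master("spark://localhost:7077")             # master port published by the container
    .config("spark.driver.host", "192.168.1.10")  # address the executors use to call back
    .config("spark.driver.port", "40000")         # fix the driver RPC port...
    .config("spark.blockManager.port", "40001")   # ...and the block manager port
    .getOrCreate()
)

print(spark.sparkContext.parallelize(range(10)).count())
spark.stop()

Alternatively, if the workers are resolving an unreachable host name for the driver, an /etc/hosts entry inside each Spark container (or a matching --add-host option on docker run) mapping that name to the driver's real IP achieves the same thing; the name and address you use there are specific to your setup.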

Hope it helps.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution 1: Mousam Singh