Caused by: java.net.SocketTimeoutException: Accept timed out
I am getting this error while running a PySpark job in PyCharm with Python 3.9, using the code below.
from pyspark.sql import SparkSession
from pyspark.sql.types import *
from pyspark.sql.functions import *
spark = SparkSession.builder.getOrCreate()
I'm doing some data analytics and want to print this DataFrame:
df = spark.createDataFrame(ds_all.cube('start_station_name').count().collect())
df = df.sort(df['count'].desc()).show(10)
print(df)
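As a side note, I realize show() returns None, so the final print(df) would only print None even if this worked; the error is raised inside show() itself. A cleaned-up sketch of what I'm trying to do (assuming ds_all is an existing DataFrame with a 'start_station_name' column) would be:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# ds_all is assumed to be an existing DataFrame with a 'start_station_name' column
counts = ds_all.cube('start_station_name').count()

# sort() returns a new DataFrame; show() only prints it and returns None
counts.sort(F.desc('count')).show(10)

This also drops the collect() plus createDataFrame() round trip, since cube('start_station_name').count() already returns a DataFrame.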
Here is the error message (there is a lot more that I did not post):
ERROR TaskSetManager: Task 3 in stage 15.0 failed 1 times; aborting job
Traceback (most recent call last):
  File "C:\Users\User\PycharmProjects\pythonProject\Case Study 1.py", line 67, in <module>
    df = df.sort(df['count'].desc()).show(10)
  File "C:\Users\User\PycharmProjects\pythonProject\venv\lib\site-packages\pyspark\sql\dataframe.py", line 494, in show
    print(self._jdf.showString(n, 20, vertical))
  File "C:\Users\User\PycharmProjects\pythonProject\venv\lib\site-packages\py4j\java_gateway.py", line 1321, in __call__
    return_value = get_return_value(
  File "C:\Users\User\PycharmProjects\pythonProject\venv\lib\site-packages\pyspark\sql\utils.py", line 111, in deco
    return f(*a, **kw)
  File "C:\Users\User\PycharmProjects\pythonProject\venv\lib\site-packages\py4j\protocol.py", line 326, in get_return_value
    raise Py4JJavaError(
py4j.protocol.Py4JJavaError: An error occurred while calling o196.showString.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 15.0 failed 1 times, most recent failure: Lost task 3.0 in stage 15.0 (TID 105) (LAPTOP-OUSSK286 executor driver): org.apache.spark.SparkException: Python worker failed to connect back.
.....
at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:188)
at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:108)
at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:121)
at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:162)
at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:65)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:131)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.SocketTimeoutException: Accept timed out
at java.net.DualStackPlainSocketImpl.waitForNewConnection(Native Method)
at java.net.DualStackPlainSocketImpl.socketAccept(DualStackPlainSocketImpl.java:131)
at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:535)
at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:189)
at java.net.ServerSocket.implAccept(ServerSocket.java:545)
at java.net.ServerSocket.accept(ServerSocket.java:513)
at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:175)
... 32 more
I'm a novice at PySpark and have already googled this, but I couldn't find a solution to this error. I would appreciate it if anyone could help me with this. I am using Windows 11 64-bit.
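The closest thing I found while googling is the suggestion that, on Windows, this "Python worker failed to connect back" / accept timeout usually means Spark is launching a different python.exe for its workers than the one running the script. I'm not sure it applies to my setup, but the suggested workaround is to pin both to the same interpreter before the SparkSession is created, roughly like this:

import os
import sys

# Point Spark's driver and workers at the interpreter running this script;
# a mismatch is a common cause of 'Python worker failed to connect back'
os.environ['PYSPARK_PYTHON'] = sys.executable
os.environ['PYSPARK_DRIVER_PYTHON'] = sys.executable

from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()

Does this look like the right direction, or is something else going on?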
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow