'Shuffle Stage Failing Due To Executor Loss
I get the following error when my spark jobs fails **"org.apache.spark.shuffle.FetchFailedException: The relative remote executor(Id: 21), which maintains the block data to fetch is dead."**
Over view of my spark job
input size is ~35 GB
I have broadcast joined all the smaller tables with the mother table into say a dataframe1 and then i salted each big table and dataframe1 before i join it with dataframe1 (left table).
profile used:
@configure(profile=[
'EXECUTOR_MEMORY_LARGE',
'NUM_EXECUTORS_32',
'DRIVER_MEMORY_LARGE',
'SHUFFLE_PARTITIONS_LARGE'
])
using the above approach and profiles i was able to get the runtime down by 50% but i still get Shuffle Stage Failing Due To Executor Loss issues.
is there a way i can fix this?
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|
