SparkException: Job aborted
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 5 in stage 76.0 failed 4 times, most recent failure: Lost task 5.3 in stage 76.0 (TID 2334) (10.139.64.5 executor 6): com.databricks.sql.io.FileReadException: Error while reading file <File_Path> It is possible the underlying files have been updated. You can explicitly invalidate the cache in Spark by running 'REFRESH TABLE tableName' command in SQL or by recreating the Dataset/DataFrame involved. If Delta cache is stale or the underlying files have been removed, you can invalidate Delta cache manually by restarting the cluster.
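For context, here is a minimal PySpark sketch of the remedies the error message itself lists; the table name and path (`my_db.my_table`, `/mnt/delta/events`) are placeholders, not taken from the original question:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Option 1: invalidate Spark's cached file listing for the table
# (placeholder table name).
spark.sql("REFRESH TABLE my_db.my_table")

# Option 2: recreate the DataFrame so it picks up the current files
# (placeholder path).
df = spark.read.format("delta").load("/mnt/delta/events")

# Option 3: if the Delta/disk cache itself is stale, restarting the
# cluster (not shown here) clears it entirely.
```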
Solution 1:
In addition to what the answer by AbhishekKhandave-MT suggests, you can also try explicitly repairing the table:
FSCK REPAIR TABLE delta.`path/to/delta`
This also fixes scenarios where the underlying files of the table have actually been changed without the change being reflected in the `_delta_log` transaction log.
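As an illustration, a hedged PySpark sketch of running the repair against a hypothetical table path (the path below is a placeholder; the `DRY RUN` variant only previews which missing-file entries would be dropped from the log):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Preview which entries for missing files would be removed from the
# transaction log (placeholder path).
spark.sql("FSCK REPAIR TABLE delta.`/mnt/delta/events` DRY RUN").show(truncate=False)

# Actually remove transaction-log entries for files that no longer
# exist in the underlying storage.
spark.sql("FSCK REPAIR TABLE delta.`/mnt/delta/events`")
```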
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | restlessmodem |
