Spark Stage Retry of Completed Stages
I have a large Spark SQL job (v2.4) that joins two Hive tables and then aggregates the result. One table is over 1 TB and the other is over 500 GB.
On the Spark UI, I see Stage ID 2 listed under Completed Stages, but Spark keeps adding new retry attempts for that stage, each with a different number of tasks.
What exactly is happening here? Can anyone point me to documentation on how a completed stage can be retried by Spark?
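For reference, the job described above is roughly of this shape. This is only a hypothetical sketch: the table names, column names, and join key below are placeholders, not taken from the original post.

```sql
-- Hypothetical sketch of the join-and-aggregate query described above.
-- All identifiers are placeholders invented for illustration.
SELECT t1.customer_id,
       SUM(t2.amount) AS total_amount
FROM   hive_table_1tb   t1   -- the ~1 TB Hive table
JOIN   hive_table_500gb t2   -- the ~500 GB Hive table
       ON t1.customer_id = t2.customer_id
GROUP  BY t1.customer_id;
```

A shuffle-heavy shape like this (a large join followed by an aggregation) produces multiple stages separated by shuffle boundaries, which is where the retried stage in the question appears on the Spark UI.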
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow