What is the benefit of using more than 1 driver core in Spark YARN cluster mode?

What is the difference between using 1 and 2 driver cores in Spark YARN cluster mode? If I use 2 driver cores in YARN cluster mode, will the Spark driver be relaunched in case of failure? If so, how many retries would it attempt before failing?

I would appreciate it if anyone could share an article on this.



Solution 1:[1]

When you launch an application in YARN cluster mode, YARN creates a container for your driver.

Depending on your application, this container might need multiple cores and several gigabytes of memory. It depends on how many sessions connect to your Spark application at the same time and on the complexity of your queries.

If your queries compile slowly, or the Spark Web UI or the application hangs, it might be worth increasing the driver core count.

From YARN's point of view, there is still only one driver container, however many cores it is given.
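
As a rough illustration of how these settings are passed, here is a minimal sketch that assembles a `spark-submit` command line for YARN cluster mode. The helper `build_submit_command`, the jar name `my-app.jar`, and the specific values are hypothetical and only for illustration. `--driver-cores` and `--driver-memory` size the single driver container. Relaunching the driver after a failure is not tied to the core count; on YARN it is governed by `spark.yarn.maxAppAttempts`, which should not exceed the cluster-wide `yarn.resourcemanager.am.max-attempts`.

```python
import shlex


def build_submit_command(app_jar, driver_cores=2, driver_memory="4g", max_app_attempts=2):
    """Assemble a spark-submit argument list for YARN cluster mode.

    All default values here are illustrative assumptions, not recommendations.
    """
    return [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", "cluster",
        # Cores and memory requested for the one driver container.
        "--driver-cores", str(driver_cores),
        "--driver-memory", driver_memory,
        # Application attempts are configured separately from driver cores.
        "--conf", f"spark.yarn.maxAppAttempts={max_app_attempts}",
        app_jar,
    ]


# "my-app.jar" is a placeholder application jar.
cmd = build_submit_command("my-app.jar")
print(shlex.join(cmd))
```

Changing `driver_cores` in this sketch only changes the size of the driver container that is requested; the number of attempts stays whatever `max_app_attempts` is set to.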

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Martin Sucharda