How to properly downscale a Spark cluster using dynamic allocation?
I am running Spark on Kubernetes using the Spark k8s operator, and I want to run a Spark Streaming application that autoscales. I have figured out how to slowly scale the cluster up using:
```
"spark.dynamicAllocation.executorAllocationRatio": "0.1"
"spark.dynamicAllocation.schedulerBacklogTimeout": "20"
```
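For context, these properties go in the `sparkConf` map of the operator's `SparkApplication` resource. A minimal sketch of the relevant part of such a manifest, assuming the `v1beta2` CRD of the spark-on-k8s operator (the application name and the idle-timeout value are illustrative, not from my actual manifest):

```yaml
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: streaming-app          # hypothetical name
spec:
  sparkConf:
    "spark.dynamicAllocation.enabled": "true"
    # Shuffle tracking stands in for the external shuffle service,
    # which is not available on Kubernetes
    "spark.dynamicAllocation.shuffleTracking.enabled": "true"
    "spark.dynamicAllocation.executorAllocationRatio": "0.1"
    "spark.dynamicAllocation.schedulerBacklogTimeout": "20"
    # An executor is supposed to be released after being idle this long,
    # but in a streaming job executors are essentially never idle
    "spark.dynamicAllocation.executorIdleTimeout": "60"
```

Note that with shuffle tracking enabled, executors holding shuffle data are governed by `spark.dynamicAllocation.shuffleTracking.timeout` rather than the plain idle timeout, which may further delay removal.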
The cluster scales up correctly when resources are needed, but it never scales back down.
According to the official docs:

> A Spark application removes an executor when it has been idle for more than `spark.dynamicAllocation.executorIdleTimeout` seconds.
Since I'm running a streaming application, this condition is almost never true (the `executorIdleTimeout` would have to be under 1 second). The only time my cluster scales down is when one of the executors gets OOMKilled.
Is there a proper way to downscale the cluster, e.g. by killing executors based on average CPU usage? Or is there a way to give priority to certain executors instead of distributing tasks across all available ones?
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
