Upgrading EKS node group from 1.17 to 1.18 failed due to AsgInstanceLaunchFailures - how do I fix it?
I have an EKS cluster that I upgraded from 1.17 to 1.18. The cluster has 2 node groups, both updated using the AWS console.
The EKS control plane and one of the node groups upgraded without problems.
The upgrade of the last node group is failing with a health issue: AsgInstanceLaunchFailures - One or more target groups not found. Validating load balancer configuration failed. The node group is now marked as Degraded.
When I look up the update ID, I see the following error:
NodeCreationFailure - Couldn't proceed with upgrade process as new nodes are not joining node group {NODE_GROUP_NAME}
I tried accessing the ASG with that ID and I can see it has several load-balancing target groups attached to it. I could not find any way to fix this in the AWS docs.
Any advice?
Solution 1:[1]
Issue resolved.
It appears that an empty target group had been added manually to the cluster (there were 3 other target groups that were created automatically). Once the empty target group was deleted, the upgrade completed successfully.
I am still unclear as to how EKS chooses the proper target group to update when there is more than one.
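For anyone hitting the same error, a minimal sketch of how the target groups attached to the node group's ASG could be audited with boto3. It assumes the problem is a target group that is either gone or has no registered targets, as in the case above. The function name and the ASG name in the usage note are hypothetical, and the clients are passed in so the logic can be tried without AWS access.

```python
def audit_asg_target_groups(asg_name, autoscaling, elbv2):
    """Classify each target group attached to an ASG.

    Returns a dict mapping target group ARN to one of:
      "missing" - the ASG references it, but ELB reports it does not exist
      "empty"   - it exists but has no registered targets
      "in_use"  - it exists and has at least one registered target
    """
    groups = autoscaling.describe_auto_scaling_groups(
        AutoScalingGroupNames=[asg_name]
    )["AutoScalingGroups"]
    if not groups:
        raise ValueError(f"Auto Scaling group not found: {asg_name}")

    report = {}
    for arn in groups[0].get("TargetGroupARNs", []):
        try:
            health = elbv2.describe_target_health(TargetGroupArn=arn)
        except Exception as exc:
            # botocore's ClientError carries the service error code in
            # exc.response; anything other than "not found" is re-raised.
            code = getattr(exc, "response", {}).get("Error", {}).get("Code")
            if code != "TargetGroupNotFound":
                raise
            report[arn] = "missing"
            continue
        targets = health["TargetHealthDescriptions"]
        report[arn] = "in_use" if targets else "empty"
    return report
```

With real credentials, this would be called as audit_asg_target_groups("my-nodegroup-asg", boto3.client("autoscaling"), boto3.client("elbv2")), where "my-nodegroup-asg" is a placeholder. Anything reported as "missing" or "empty" is a candidate for detaching from the ASG before retrying the upgrade, but whether that is safe depends on what else uses the target group.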
Solution 2:[2]
Are you able to launch new nodes that come up in the Ready state and join the cluster? According to the EKS public documentation, the upgrade request succeeds only when the ASG can launch new instances that reach the Ready state in all the AZs of the node group.
To debug this further, you can trigger a new upgrade request and check the health of new nodes which EKS brings up in your cluster.
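As a rough illustration of that debugging step, the sketch below pulls the node group's reported health issues and any failed scaling activities from its ASG using boto3. The function name is hypothetical, and the clients are passed in as parameters. It only gathers what AWS already reports and does not diagnose the cause.

```python
def summarize_nodegroup_failures(cluster, nodegroup, eks, autoscaling,
                                 max_activities=20):
    """Collect health issues and failed ASG activities for a managed node group."""
    ng = eks.describe_nodegroup(
        clusterName=cluster, nodegroupName=nodegroup
    )["nodegroup"]

    # Health issues as EKS reports them, e.g. AsgInstanceLaunchFailures.
    issues = [
        f"{issue.get('code')}: {issue.get('message')}"
        for issue in ng.get("health", {}).get("issues", [])
    ]

    # Recent scaling activities that failed on the node group's ASG(s).
    failed = []
    for asg in ng.get("resources", {}).get("autoScalingGroups", []):
        activities = autoscaling.describe_scaling_activities(
            AutoScalingGroupName=asg["name"], MaxRecords=max_activities
        )["Activities"]
        for activity in activities:
            if activity.get("StatusCode") == "Failed":
                detail = activity.get("StatusMessage") or activity.get("Description", "")
                failed.append(f"{asg['name']}: {detail}")

    return {"health_issues": issues, "failed_activities": failed}
```

With real credentials, the eks and autoscaling arguments would be boto3.client("eks") and boto3.client("autoscaling"). The status messages of failed activities usually state why an instance could not be launched, which is a good starting point before triggering another upgrade.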
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | JuliaS |
| Solution 2 | Ravi Sinha |
