'Kubernetes Upgrade - 1.17.8 to 1.18.20 fails at first master node (IPv6 only cluster)

Would need some help with debugging the upgrade failure.

Upgrade failure of Ipv6 cluster (3 master nodes) from 1.17.8 to 1.18.20.

Error Traces


/usr/bin/kubeadm upgrade apply v1.18.20 -y
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade/version] You have chosen to change the cluster version to "v1.18.20"
[upgrade/versions] Cluster version: v1.17.8
[upgrade/versions] kubeadm version: v1.18.20
[upgrade/prepull] Will prepull images for components [kube-apiserver kube-controller-manager kube-scheduler etcd]
[upgrade/prepull] Prepulling image for component etcd.
[upgrade/prepull] Prepulling image for component kube-scheduler.
[upgrade/prepull] Prepulling image for component kube-controller-manager.
[upgrade/prepull] Prepulling image for component kube-apiserver.
[apiclient] Found 3 Pods for label selector k8s-app=upgrade-prepull-kube-apiserver
[apiclient] Found 0 Pods for label selector k8s-app=upgrade-prepull-kube-scheduler
[apiclient] Found 3 Pods for label selector k8s-app=upgrade-prepull-kube-controller-manager
[apiclient] Found 0 Pods for label selector k8s-app=upgrade-prepull-etcd
[apiclient] Found 3 Pods for label selector k8s-app=upgrade-prepull-etcd
[apiclient] Found 3 Pods for label selector k8s-app=upgrade-prepull-kube-scheduler
[upgrade/prepull] Prepulled image for component kube-apiserver.
[upgrade/prepull] Prepulled image for component kube-scheduler.
[upgrade/prepull] Prepulled image for component kube-controller-manager.
[upgrade/prepull] Prepulled image for component etcd.
[upgrade/prepull] Successfully prepulled the images for all the control plane components
[upgrade/apply] Upgrading your Static Pod-hosted control plane to version "v1.18.20"...
Static pod: kube-apiserver-cvmonq1 hash: 1c8ea0abd77378cdddfef939bb4154ad
Static pod: kube-controller-manager-cvmonq1 hash: 00363dd5e255940e1cc72f797b04b39f
Static pod: kube-scheduler-cvmonq1 hash: 7d3db474476c65a94558519f47794665
[upgrade/etcd] Upgrading to TLS for etcd
[upgrade/etcd] Non fatal issue encountered during upgrade: the desired etcd version for this Kubernetes version "v1.18.20" is "3.4.3-0", but the current etcd version is "3.4.3". Won't downgrade etcd, instead just continue
[upgrade/staticpods] Writing new Static Pod manifests to "/etc/kubernetes/tmp/kubeadm-upgraded-manifests431381277"
W0418 20:45:26.227640 1725518 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[upgrade/staticpods] Preparing for "kube-apiserver" upgrade
[upgrade/staticpods] Renewing apiserver certificate
[upgrade/staticpods] Renewing apiserver-kubelet-client certificate
[upgrade/staticpods] Renewing front-proxy-client certificate
[upgrade/staticpods] Renewing apiserver-etcd-client certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-apiserver.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2022-04-18-20-45-24/kube-apiserver.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-apiserver-cvmonq1 hash: 1c8ea0abd77378cdddfef939bb4154ad


[upgrade/apply] FATAL: couldn't upgrade control plane. kubeadm has tried to recover everything into the earlier state. Errors faced: timed out waiting for the condition

On adding verbose level debug, timeout seen at

I0418 20:56:37.525538 1734292 round_trippers.go:444] GET https://[2001:420:293:2471:172:29:54:135]:6443/api/v1/namespaces/kube-system/pods/kube-apiserver-cvmonq1?timeout=10s  in 0 milliseconds
I0418 20:56:37.525572 1734292 round_trippers.go:450] Response Headers:
I0418 20:56:37.525817 1734292 round_trippers.go:424] curl -k -v -XGET  -H "Accept: application/json, */*" -H "User-Agent: kubeadm/v1.18.20 (linux/amd64) kubernetes/1f3e19b" 'https://[2001:420:293:2471:172:29:54:135]:6443/api/v1/namespaces/kube-system/pods/kube-apiserver-cvmonq1?timeout=10s'
I0418 20:56:37.526212 1734292 round_trippers.go:444] GET https://[2001:420:293:2471:172:29:54:135]:6443/api/v1/namespaces/kube-system/pods/kube-apiserver-cvmonq1?timeout=10s  in 0 milliseconds
I0418 20:56:37.526237 1734292 round_trippers.go:450] Response Headers:
timed out waiting for the condition
couldn't upgrade control plane. kubeadm has tried to recover everything into the earlier state. Errors faced
k8s.io/kubernetes/cmd/kubeadm/app/phases/upgrade.rollbackOldManifests
    /workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/phases/upgrade/staticpods.go:520
k8s.io/kubernetes/cmd/kubeadm/app/phases/upgrade.upgradeComponent
    /workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/phases/upgrade/staticpods.go:252
k8s.io/kubernetes/cmd/kubeadm/app/phases/upgrade.StaticPodControlPlane
    /workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/phases/upgrade/staticpods.go:477
k8s.io/kubernetes/cmd/kubeadm/app/phases/upgrade.PerformStaticPodUpgrade
    /workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/phases/upgrade/staticpods.go:611
k8s.io/kubernetes/cmd/kubeadm/app/cmd/upgrade.PerformControlPlaneUpgrade
    /workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/upgrade/apply.go:223
k8s.io/kubernetes/cmd/kubeadm/app/cmd/upgrade.runApply
    /workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/upgrade/apply.go:164
k8s.io/kubernetes/cmd/kubeadm/app/cmd/upgrade.NewCmdApply.func1
    /workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/upgrade/apply.go:79
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).execute
    /workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:826
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).ExecuteC
    /workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:914
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).Execute
    /workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:864
k8s.io/kubernetes/cmd/kubeadm/app.Run
    /workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/kubeadm.go:50
main.main
    _output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/kubeadm.go:25
runtime.main
    /usr/local/go/src/runtime/proc.go:203
runtime.goexit
    /usr/local/go/src/runtime/asm_amd64.s:1357
[upgrade/apply] FATAL
k8s.io/kubernetes/cmd/kubeadm/app/cmd/upgrade.runApply
    /workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/upgrade/apply.go:165
k8s.io/kubernetes/cmd/kubeadm/app/cmd/upgrade.NewCmdApply.func1
    /workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/upgrade/apply.go:79
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).execute
    /workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:826
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).ExecuteC
    /workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:914
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).Execute
    /workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:864
k8s.io/kubernetes/cmd/kubeadm/app.Run
    /workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/kubeadm.go:50
main.main
    _output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/kubeadm.go:25
runtime.main
    /usr/local/go/src/runtime/proc.go:203
runtime.goexit
    /usr/local/go/src/runtime/asm_amd64.s:1357
[root@cvmonq1 manifests]# 

Boostrapping of IPV6 direcly in v.1.18.20 works in the same setup without any issues.

Failure observed only in the update path. Any inputs to debug/fix the issue is appreciated.



Solution 1:[1]

The issue you are hitting is because you are trying to upgrade from the current version to the current version.

Checkout the Github link for more information.

Another workaround can be to downgrade your cluster version and then upgrade as mentioned in the link.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Fariya Rahmat