
My pod is stuck in ContainerCreating status with this message:

Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "483590313b7fd092fe5eeec92356152721df3e971d942174464ac5a3f1529898" network for pod "my-nginx": networkPlugin cni failed to set up pod "my-nginx_default" network: CNI failed to retrieve network namespace path: cannot find network namespace for the terminated container "483590313b7fd092fe5eeec92356152721df3e971d942174464ac5a3f1529898", failed to clean up sandbox container "483590313b7fd092fe5eeec92356152721df3e971d942174464ac5a3f1529898" network for pod "my-nginx": networkPlugin cni failed to teardown pod "my-nginx_default" network: error getting ClusterInformation: Get https://[10.96.0.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default: dial tcp 10.96.0.1:443: i/o timeout]

The state of the worker node is Ready.

But the output of kubectl get pods -n kube-system shows issues:

NAME                                       READY   STATUS                   RESTARTS   AGE
calico-kube-controllers-6dfcd885bf-ktbhb   1/1     Running                  0          22h
calico-node-4fs2v                          0/1     Init:RunContainerError   1          22h
calico-node-l9qvc                          0/1     Running                  0          22h
coredns-f9fd979d6-8pzcd                    1/1     Running                  0          23h
coredns-f9fd979d6-v4cq8                    1/1     Running                  0          23h
etcd-k8s-master                            1/1     Running                  1          23h
kube-apiserver-k8s-master                  1/1     Running                  128        23h
kube-controller-manager-k8s-master         1/1     Running                  4          23h
kube-proxy-bwtwj                           0/1     CrashLoopBackOff         342        23h
kube-proxy-stq7q                           1/1     Running                  1          23h
kube-scheduler-k8s-master                  1/1     Running                  4          23h

And the result of the command kubectl -n kube-system logs kube-proxy-bwtwj was:

failed to try resolving symlinks in path "/var/log/pods/kube-system_kube-proxy-bwtwj_1a0f4b93-cc6f-46b9-bf29-125feba593cb/kube-proxy/348.log": lstat /var/log/pods/kube-system_kube-proxy-bwtwj_1a0f4b93-cc6f-46b9-bf29-125feba593cb/kube-proxy/348.log: no such file or directory


Solution 1:[1]

I see two topics here:

  1. The default --pod-network-cidr for Calico is 192.168.0.0/16. You can use a different one, but always make sure that there are no overlaps with existing networks. However, I have tested with the default one and my cluster runs with no problems. In order to start over with a proper config, you should remove the node and clean up the control plane. Then proceed with:
  • kubeadm init --pod-network-cidr=192.168.0.0/16

  • mkdir -p $HOME/.kube

  • cp -i /etc/kubernetes/admin.conf $HOME/.kube/config

  • chown $(id -u):$(id -g) $HOME/.kube/config

  • kubectl apply -f https://docs.projectcalico.org/v3.14/manifests/calico.yaml

  • After that, join your worker nodes with kubeadm join

Use sudo where/if needed. All necessary details can be found in the official documentation.
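The "remove the node and clean up the control plane" step above can be sketched as follows (a minimal sketch; k8s-worker-1 is a placeholder for your actual worker node name, and flag names can vary slightly between kubectl versions):

```shell
# On the control plane: evict workloads and remove the worker from the cluster.
kubectl drain k8s-worker-1 --ignore-daemonsets
kubectl delete node k8s-worker-1

# On each node being reset: undo what kubeadm init/join set up.
sudo kubeadm reset

# Remove the stale kubeconfig so the fresh admin.conf can be copied in later.
rm -f $HOME/.kube/config
```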

  2. The failed to try resolving symlinks error means that the kubelet is looking for the pod logs in the wrong directory. In order to fix it, you need to pass the --log-dir=/var/log flag to the kubelet. After adding the flag, run systemctl daemon-reload and systemctl restart kubelet so the kubelet picks it up. This has to be done on all of your nodes.
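One way to add the flag (a sketch, assuming a kubeadm-provisioned node where extra kubelet flags are read from /etc/default/kubelet; the exact file may differ on your distribution):

```shell
# Add the flag to the kubelet's extra arguments (path is distro-dependent;
# some setups use /etc/sysconfig/kubelet instead).
echo 'KUBELET_EXTRA_ARGS="--log-dir=/var/log"' | sudo tee /etc/default/kubelet

# Reload systemd unit files and restart the kubelet with the new flag.
sudo systemctl daemon-reload
sudo systemctl restart kubelet
```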

Solution 2:[2]

Make sure you deploy Calico before joining other nodes to your cluster. When you have other nodes in your cluster, calico-kube-controllers sometimes gets pushed to a worker node, which can lead to issues.
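To check which node calico-kube-controllers actually landed on, the wide output is enough (sketch):

```shell
# The NODE column shows where each kube-system pod was scheduled.
kubectl get pods -n kube-system -o wide
```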

Solution 3:[3]

You need to carefully check the logs of the calico-node pods. In my case I had some other network interfaces, and the autodetection mechanism in Calico was detecting the wrong interface (IP address). You should consult this documentation: https://projectcalico.docs.tigera.io/reference/node/configuration.
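A way to inspect this (a sketch; calico-node-4fs2v is the failing pod from the output above, substitute your own pod name):

```shell
# List the host's interfaces and addresses to see what Calico could have picked.
ip -br addr

# Inspect a calico-node pod's logs for the address it actually chose.
kubectl -n kube-system logs calico-node-4fs2v -c calico-node
```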

What I did in my case was simply:

kubectl set env daemonset/calico-node -n kube-system IP_AUTODETECTION_METHOD=cidr=172.16.8.0/24

The cidr value is my "working network". After this, all calico-node pods restarted and suddenly everything was fine.
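After setting the variable, you can watch the daemonset roll out and confirm the pods come up (sketch):

```shell
# Wait until every calico-node pod has been recreated with the new env var.
kubectl -n kube-system rollout status daemonset/calico-node

# Confirm the calico-node pods report Ready.
kubectl -n kube-system get pods -l k8s-app=calico-node
```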

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution 1: Wytrzymały Wiktor
Solution 2: Orion
Solution 3: Przemek