Kubernetes nginx ingress controller is unreliable
I need help understanding in detail how an ingress controller, specifically the ingress-nginx ingress controller, is supposed to work. To me, it appears as a black box that is supposed to listen on a public IP, terminate TLS, and forward traffic to a pod. But exactly how that happens is a mystery to me.
The primary goal here is understanding, the secondary goal is troubleshooting an immediate issue I'm facing.
I have a cluster with five nodes, and am trying to get the JupyterHub application to run on it. For the most part, it is working fine. I'm using a pretty standard Rancher RKE setup with Flannel/Calico for the networking. The nodes run Red Hat 7.9 with iptables and firewalld, and Docker 19.03.
The Jupyterhub proxy is set up with a ClusterIP service (I also tried a NodePort service, that also works). I also set up an ingress. The ingress sometimes works, but oftentimes does not respond (connection times out). Specifically, if I delete the ingress, and then redeploy my helm chart, the ingress will start working. Also, if I restart one of my nodes, the ingress will start working again. I have not identified the circumstances when the ingress stops working.
Here are my relevant services:
```
kubectl get services

NAME           TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)    AGE
hub            ClusterIP   10.32.0.183   <none>        8081/TCP   378d
proxy-api      ClusterIP   10.32.0.11    <none>        8001/TCP   378d
proxy-public   ClusterIP   10.32.0.30    <none>        80/TCP     378d
```
This works; `telnet 10.32.0.30 80` responds as expected (of course only from one of the nodes). I can also telnet directly to the proxy-public pod (10.244.4.41:8000 in my case).
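For reference, these are the checks I've been running (the namespace, service name, and IPs are the ones from my cluster; yours will differ):

```shell
# Confirm the Service actually resolves to the expected pod endpoint:
kubectl -n jhub get endpoints proxy-public

# Reachability of the ClusterIP (only works from a node or a pod):
telnet 10.32.0.30 80

# Reachability of the pod itself, bypassing the Service:
telnet 10.244.4.41 8000
```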
Here is my ingress.
```
kubectl describe ingress

Name:             jupyterhub
Labels:           app=jupyterhub
                  app.kubernetes.io/managed-by=Helm
                  chart=jupyterhub-1.2.0
                  component=ingress
                  heritage=Helm
                  release=jhub
Namespace:        jhub
Address:          k8s-node4.<redacted>,k8s-node5.<redacted>
Default backend:  default-http-backend:80 (<error: endpoints "default-http-backend" not found>)
TLS:
  tls-jhub terminates jupyterhub.<redacted>
Rules:
  Host                   Path  Backends
  ----                   ----  --------
  jupyterhub.<redacted>
                         /     proxy-public:http (10.244.4.41:8000)
Annotations:  field.cattle.io/publicEndpoints:
                [{"addresses":["",""],"port":443,"protocol":"HTTPS","serviceName":"jhub:proxy-public","ingressName":"jhub:jupyterhub","hostname":"jupyterh...
              meta.helm.sh/release-name: jhub
              meta.helm.sh/release-namespace: jhub
Events:       <none>
```
What I understand so far about the ingress in this situation:
Traffic arrives on port 443 at k8s-node4 or k8s-node5. Some magic (controlled by the ingress controller) receives that traffic, terminates TLS, and sends the unencrypted traffic to the pod's IP at port 8000. That's the part I want to understand better.
That black box seems to at least partially involve Flannel/Calico and some iptables magic, and it also obviously involves nginx at some point.
Update: in the meantime, I identified what causes Kubernetes to break: restarting firewalld.
As best I can tell, that wipes out all iptables rules, not just the firewalld-generated ones.
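A quick way to check whether the rules are gone (KUBE-SERVICES is the standard chain that kube-proxy creates; the restart command is what applies to my RKE/Docker setup and may differ on yours):

```shell
# If kube-proxy's chains were wiped, this fails with "No chain/target/match by that name":
iptables -t nat -nL KUBE-SERVICES | head

# kube-proxy rebuilds its rules when restarted; on RKE it runs as a Docker container:
docker restart kube-proxy
```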
Solution 1:
I found the answer to my question here: https://www.stackrox.io/blog/kubernetes-networking-demystified/ One caveat: the details may vary to some extent depending on which CNI plugin you are using, although everything I saw was strictly related to Kubernetes itself.
I'm still trying to digest the content of that blog, and I highly recommend referring directly to it instead of relying on my answer, which may be a poor retelling of the story.
Here is approximately how a packet that arrives on port 443 flows.
To see the relevant tables, run:

```
iptables -t nat -vnL | less
```

The output of this looks rather intimidating.
The walkthrough below cuts out a lot of other chains and calls to get to the point. In this example:
- This cluster uses the Canal CNI plugin (Calico combined with Flannel).
- The listen port is 443.
- The nginx-ingress-controller pod listens (among other addresses) at 10.244.0.183.
In that situation, here is how the packet flows:
- The packet comes into the PREROUTING chain.
- The PREROUTING chain calls (among other things) the CNI-HOSTPORT-DNAT chain.
- The POSTROUTING chain also calls the same chain.
- The CNI-HOSTPORT-DNAT chain in turn calls several CNI-DN-xxxx chains.
- The CNI-DN-xxxx chains perform DNAT and change the destination address to 10.244.0.183.
- The nginx container inside the nginx-ingress-controller pod listens on 10.244.0.183.
There is some additional complexity involved if the pod is on a different node than the one the packet arrived on, and also if multiple pods are load-balanced behind the same port. Load balancing is handled by the iptables statistic module, which randomly picks one of several otherwise-identical rules.
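The statistic-module rules look roughly like the following (the chain names here are made up for illustration; with two endpoints, the first rule matches with probability 0.5 and the second rule catches everything that falls through):

```shell
# First endpoint is chosen with probability 0.5 ...
iptables -t nat -A KUBE-SVC-EXAMPLE -m statistic --mode random --probability 0.5 \
    -j KUBE-SEP-ENDPOINT1
# ... otherwise the packet falls through to the second endpoint:
iptables -t nat -A KUBE-SVC-EXAMPLE -j KUBE-SEP-ENDPOINT2
```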
Internal traffic from a service to a pod follows a similar, but not identical, flow. In this example:
- The service is at 10.32.0.183, port 8081.
- The pod behind it is at 10.244.6.112, port 8081.

```
Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
...
KUBE-SERVICES  all  --  *  *  0.0.0.0/0  0.0.0.0/0

Chain KUBE-SERVICES (2 references)
...
/* Traffic from within the cluster to 10.32.0.183:8081 */
    0     0 KUBE-SVC-ZHCKOT5PFJF4PASJ  tcp  --  *  *  0.0.0.0/0       10.32.0.183  tcp dpt:8081
...

/* Mark the packet if it does not originate from within the pod network */
Chain KUBE-SVC-ZHCKOT5PFJF4PASJ (1 references)
 pkts bytes target                     prot opt in out source          destination
    0     0 KUBE-MARK-MASQ             tcp  --  *  *  !10.244.0.0/16  10.32.0.183  tcp dpt:8081
    0     0 KUBE-SEP-RYU73S2VFHOHW4XO  all  --  *  *  0.0.0.0/0       0.0.0.0/0

/* Perform DNAT, redirecting from 10.32.0.183 to 10.244.6.112 */
Chain KUBE-SEP-RYU73S2VFHOHW4XO (1 references)
    0     0 KUBE-MARK-MASQ             all  --  *  *  10.244.6.112    0.0.0.0/0
    0     0 DNAT                       tcp  --  *  *  0.0.0.0/0       0.0.0.0/0    tcp to:10.244.6.112:8081
```
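One way to confirm this flow on a live node is to watch the packet counters in the first two columns; they increment as traffic matches each rule (the chain name below is from my cluster and will differ on yours):

```shell
# Find the rule for the ClusterIP and note which KUBE-SVC-... chain it jumps to:
iptables -t nat -vnL KUBE-SERVICES | grep 10.32.0.183

# Watch that chain's counters while sending a test request from another terminal:
watch -n1 'iptables -t nat -vnL KUBE-SVC-ZHCKOT5PFJF4PASJ'
```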
Regarding the second part of my question, how to get the nodes to work reliably:
- Disable firewalld.
- Use Kubernetes network policies instead (or Calico network policies, if you are using Calico).
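On a systemd host, disabling firewalld looks like this (stopping it can flush its rules one last time, so let kube-proxy rebuild afterwards; the `docker restart` applies to my RKE setup and may differ on yours):

```shell
# Stop firewalld now and keep it from coming back on reboot:
systemctl disable --now firewalld

# If Kubernetes networking was already broken, have kube-proxy rebuild its rules;
# rebooting the node also works:
docker restart kube-proxy
```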
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Kevin Keane |
