Issues with outbound connections from pods on a GKE cluster with NAT (and router)
I'm investigating random 'Connection reset by peer' errors and long (up to 2 minutes) PDO connection initializations, but I have not been able to find a solution.
Similar issue: https://kubernetes.io/blog/2019/03/29/kube-proxy-subtleties-debugging-an-intermittent-connection-reset/, but that is supposed to be fixed in the version of Kubernetes that I'm running.
GKE config details: the cluster runs version 1.20.12-gke.1500 with a NAT network configuration and a router. The cluster has 2 nodes, and the router has 2 static IPs assigned, with dynamic port allocation and a range of 32728-65536 ports per VM.
On the Kubernetes side:
- deployments: Docker image with local nginx, php-fpm, and Google SQL proxy
- services: LoadBalancer to expose the deployment
To reproduce the issue, I created a simple script that connects to the database in a loop and runs a simple count query. I ruled out the database server by running the script on a standalone GCE VM, where I didn't get any errors. When I run the script on any of the application pods in the cluster, I get random 'Connection reset by peer' errors. I have tested the script both through the Google SQL proxy service and with the direct database IP, with the same random connection issues.
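For anyone who wants to run a similar check, here is a minimal sketch of such a loop. It is not the original script: that one used PHP/PDO and ran a count query, while this one is written in Python and only tests TCP connection setup, so it will not catch resets that happen mid-query. The function name `probe` and the `slow_after` threshold are made up for illustration.

```python
import socket
import time
from collections import Counter


def probe(host, port, attempts=100, timeout=5.0, slow_after=1.0):
    """Open `attempts` TCP connections one after another and tally the outcomes.

    `slow_after` (seconds) is an arbitrary threshold for flagging slow connects.
    """
    results = Counter()
    for _ in range(attempts):
        started = time.monotonic()
        try:
            # Connect and close immediately; only connection setup is tested.
            with socket.create_connection((host, port), timeout=timeout):
                pass
            elapsed = time.monotonic() - started
            results["slow" if elapsed > slow_after else "ok"] += 1
        except ConnectionResetError:
            # This is the 'Connection reset by peer' case.
            results["reset"] += 1
        except socket.timeout:
            results["timeout"] += 1
        except OSError:
            # Anything else, e.g. connection refused or host unreachable.
            results["other_error"] += 1
    return results
```

Running something like `probe("127.0.0.1", 3306)` inside a pod (assuming the SQL proxy listens on that address and port, which depends on your setup) and then the same call from a standalone VM should make it easy to compare the tallies between the two environments.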
Any help would be appreciated.
Update
On https://cloud.google.com/kubernetes-engine/docs/release-notes I can see that a fix has been released that potentially addresses what I'm seeing: "The following GKE versions fix a known issue in which random TCP connection resets might happen for GKE nodes that use Container-Optimized OS with Docker (cos). To fix the issue, upgrade your nodes to any of these versions:"
I'm updating the nodes this evening, so I hope that will solve the issue.
Update
Updating the nodes solved the random connection resets.
Solution 1:[1]
Updating the cluster and nodes to version 1.20.15-gke.3400 using the Google Cloud panel resolved the issue.
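If you want to check whether your nodes are older than a version known to work, a small comparison helper might look like the sketch below. The function names are hypothetical, and the version format is assumed to be `MAJOR.MINOR.PATCH-gke.BUILD` as in the versions quoted in this post. Note that 1.20.15-gke.3400 is simply the version that worked here, not necessarily the earliest fixed one, and the comparison is only meaningful within the same minor release line; the release notes remain the authoritative list.

```python
import re


def parse_gke_version(version):
    """Turn a string like '1.20.12-gke.1500' into a tuple of four integers."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)-gke\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized GKE version format: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_older_than(current, reference):
    """Return True if `current` sorts before `reference`."""
    return parse_gke_version(current) < parse_gke_version(reference)


# The version from the question compared with the one that resolved the issue.
print(is_older_than("1.20.12-gke.1500", "1.20.15-gke.3400"))
```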
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Maciek W. |
