'GKE Autopilot - Containers stuck in init phase on particular node
I'm using GKE's Autopilot Cluster to run some kubernetes workloads. Pods getting scheduled to one of the allocated nodes is taking around 10 mins stuck in init phase. Same pod in different node is up in seconds.
deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: jobs
spec:
replicas: 1
selector:
matchLabels:
app: job
template:
metadata:
labels:
app: job
spec:
volumes:
- name: shared-data
emptyDir: {}
initContainers:
- name: init-volume
image: gcr.io/dummy_image:latest
imagePullPolicy: Always
resources:
limits:
memory: "1024Mi"
cpu: "1000m"
ephemeral-storage: "10Gi"
volumeMounts:
- name: shared-data
mountPath: /data
command: ["/bin/sh","-c"]
args:
- cp -a /path /data;
containers:
- name: job-server
resources:
requests:
ephemeral-storage: "5Gi"
limits:
memory: "1024Mi"
cpu: "1000m"
ephemeral-storage: "10Gi"
image: gcr.io/jobprocessor:latest
imagePullPolicy: Always
volumeMounts:
- name: shared-data
mountPath: /ebdata1
This happens only if container has init container. In my case, I'm copying some data from dummy container to shared volume which I'm mounting on actual container.. But whenever pods get scheduled to this particular node, it gets stuck in init phase for around 10 minutes and automatically gets resolved. I couldn't see any errors in event logs.
kubectl describe node problematic-node
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning SystemOOM 52m kubelet System OOM encountered, victim process: cp, pid: 477887
Warning OOMKilling 52m kernel-monitor Memory cgroup out of memory: Killed process 477887 (cp) total-vm:2140kB, anon-rss:564kB, file-rss:768kB, shmem-rss:0kB, UID:0 pgtables:44kB oom_score_adj:-997
Only message is the above warning. Is this issue caused by some misconfiguration from my side?
Solution 1:[1]
The best recommendation is for you to manage container compute resources properly within your Kubernetes cluster. When creating a Pod, you can optionally specify how much CPU and memory (RAM) each Container needs to avoid OOM situations.
When Containers have resource requests specified, the scheduler can make better decisions about which nodes to place Pods on. And when Containers have their limits specified, contention for resources on a node can be handled in a specified manner. CPU specifications are in units of cores, and memory is specified in units of bytes.
An event is produced each time the scheduler fails, use the command below to see the status of events:
$ kubectl describe pod <pod-name>| grep Events
Also, read the official Kubernetes guide on “Configure Out Of Resource Handling”. Always make sure to:
reserve 10-20% of memory capacity for system daemons like kubelet and OS kernel identify pods which can be evicted at 90-95% memory utilization to reduce thrashing and incidence of system OOM.
To facilitate this kind of scenario, the kubelet would be launched with options like below:
--eviction-hard=memory.available<xMi
--system-reserved=memory=yGi
Having Heapster container monitoring in place must be helpful for visualization. Read more reading on Kubernetes and Docker Administration.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Nestor Daniel Ortega Perez |
