Are anti-affinity rules from existing pods able to prevent other deployments from being scheduled in Kubernetes?
We have a Kubernetes cluster on AWS (EKS) with several nodegroups, each running several nodes across several availability zones. On this EKS cluster we run a MongoDB cluster in nodegroup acceptance-cluster-ng-mongo-db. Every node in this nodegroup runs in a different availability zone.
We now want to run another MongoDB cluster in a different nodegroup (acceptance-cluster-ng-another-mongo-db, real name redacted). We created a completely new nodegroup on the existing availability zones; every node in this nodegroup runs in a different availability zone, and these are the same availability zones the existing MongoDB cluster uses. However, the pods will not spin up, regardless of the affinity/anti-affinity rules we use. We tried 3 separate attempts.
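To illustrate the layout, the nodegroup and zone of every node can be listed in one command (the two label keys are the standard eksctl and Kubernetes topology labels, visible in the node description at the end of this post):
```
kubectl get nodes -L alpha.eksctl.io/nodegroup-name,topology.kubernetes.io/zone
```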
For the existing MongoDB cluster we always used the following affinity/anti-affinity rules:
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: alpha.eksctl.io/nodegroup-name
          operator: In
          values:
          - acceptance-cluster-ng-mongo-db
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: app.kubernetes.io/name
          operator: In
          values:
          - mongodb
      topologyKey: "topology.kubernetes.io/zone"
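These rules should only constrain pods whose labels match the labelSelector above. As a sanity check of where they currently apply, the pods matching that selector and the node each one runs on can be listed like this (assuming the existing replica set also lives in the acceptance namespace; adjust -n otherwise):
```
# which pods the anti-affinity selector matches, and the node each one runs on
kubectl get pods -n acceptance -l app.kubernetes.io/name=mongodb -o wide
```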
ATTEMPT 1
For the new MongoDB cluster we started by using the following rules:
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: alpha.eksctl.io/nodegroup-name
          operator: In
          values:
          - acceptance-cluster-ng-another-mongo-db
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: app.kubernetes.io/name
          operator: In
          values:
          - another-mongodb
      topologyKey: "topology.kubernetes.io/zone"
This gave us the message:
0/11 nodes are available: 3 node(s) didn't match pod affinity/anti-affinity rules, 3 node(s) didn't satisfy existing pods anti-affinity rules, 8 node(s) didn't match Pod's node affinity/selector.
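Note that a required podAntiAffinity term (both the one above and the one on the already-running pods) is evaluated against the labels the pods actually carry, not against the Helm release name, so it is worth double-checking which value app.kubernetes.io/name ends up with on the new pods. A quick check, using the instance label visible in the pod description further below:
```
kubectl get pods -n acceptance -l app.kubernetes.io/instance=another-mongodb --show-labels
```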
ATTEMPT 2
We then tried to target a specific host (which lives in the desired nodegroup) with the following config (no anti-affinity):
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchFields:
        - key: metadata.name
          operator: In
          values:
          - ip-xxx-xxx-xxx-xxx.xxx.compute.internal
This gave us the message:
0/11 nodes are available: 1 node(s) didn't match pod affinity/anti-affinity rules, 1 node(s) didn't satisfy existing pods anti-affinity rules, 10 node(s) didn't match Pod's node affinity/selector.
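As a pure diagnostic (not something to keep), a throwaway pod with spec.nodeName set bypasses the scheduler completely, so it would start on that node even if inter-pod anti-affinity is what rejects scheduling. A minimal sketch, with the pod name hypothetical and the redacted hostname as a placeholder:
```
apiVersion: v1
kind: Pod
metadata:
  name: nodename-test        # hypothetical throwaway pod
  namespace: acceptance
spec:
  nodeName: ip-xxx-xxx-xxx-xxx.xxx.compute.internal   # placeholder for the real node name
  containers:
  - name: pause
    image: registry.k8s.io/pause:3.9
```
If this pod comes up, the node itself is healthy and the rejection happens purely at scheduling time.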
ATTEMPT 3
We then tried to deploy without any affinity/anti-affinity rules. This gave us the message:
0/11 nodes are available: 11 node(s) didn't match pod affinity/anti-affinity rules, 11 node(s) didn't satisfy existing pods anti-affinity rules.
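Because attempt 3 still reports affinity failures for all 11 nodes even though we set no rules ourselves, it can help to print the affinity section of the pending pod exactly as the API server sees it, in case the chart injects something we did not specify (pod name taken from the description below):
```
kubectl get pod another-mongodb-0 -n acceptance -o jsonpath='{.spec.affinity}'
```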
CONCLUSION:
So after several days of tinkering, we are out of ideas. We are wondering whether the first MongoDB deployment, with its anti-affinity rules, is blocking our second deployment, even though the nodegroup is different and we would expect those rules to apply only to the pods of the existing Mongo cluster.
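One way to test that suspicion directly is to dump the podAntiAffinity of all running pods and check whether any of those labelSelectors also match the labels of the new pods, for example:
```
kubectl get pods -A -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,ANTI_AFFINITY:.spec.affinity.podAntiAffinity'
```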
Also, I should mention that we see no taints on the nodes in nodegroup acceptance-cluster-ng-another-mongo-db. Each node is running v1.21.5-eks-9017834.
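For completeness, a quick way to confirm there are no taints on those nodes (the TAINTS column stays <none> for untainted nodes):
```
kubectl get nodes -l alpha.eksctl.io/nodegroup-name=acceptance-cluster-ng-another-mongo-db \
  -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints'
```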
Edit: Here is the kubectl describe output for a pod that will not start, using the configuration from attempt 2 that targets a specific host in the desired nodegroup:
Name: another-mongodb-0
Namespace: acceptance
Priority: 0
Node: <none>
Labels: app=mongodb
        app.kubernetes.io/component=mongodb
        app.kubernetes.io/instance=another-mongodb
        app.kubernetes.io/managed-by=Helm
        app.kubernetes.io/name=mongodb
        controller-revision-hash=another-mongodb-68b848c5f9
        helm.sh/chart=mongodb-10.0.5
        statefulset.kubernetes.io/pod-name=another-mongodb-0
Annotations: kubernetes.io/psp: eks.privileged
Status: Pending
IP:
IPs: <none>
Controlled By: StatefulSet/another-mongodb
Containers:
  mongodb:
    Image: docker.io/bitnami/mongodb:4.4.2-debian-10-r0
    Port: 27017/TCP
    Host Port: 0/TCP
    Command:
      /scripts/setup.sh
    Liveness: exec [mongo --eval db.adminCommand('ping')] delay=30s timeout=5s period=10s #success=1 #failure=6
    Readiness: exec [mongo --eval db.adminCommand('ping')] delay=5s timeout=5s period=10s #success=1 #failure=6
    Environment:
      BITNAMI_DEBUG: false
      MY_POD_NAME: another-mongodb-0 (v1:metadata.name)
      MY_POD_NAMESPACE: acceptance (v1:metadata.namespace)
      K8S_SERVICE_NAME: another-mongodb-headless
      MONGODB_INITIAL_PRIMARY_HOST: another-mongodb-0.$(K8S_SERVICE_NAME).$(MY_POD_NAMESPACE).svc.cluster.local
      MONGODB_REPLICA_SET_NAME: ANOTHER-MONGODB-ACC
      MONGODB_ADVERTISED_HOSTNAME: $(MY_POD_NAME).$(K8S_SERVICE_NAME).$(MY_POD_NAMESPACE).svc.cluster.local
      MONGODB_ROOT_PASSWORD: <set to the key 'mongodb-root-password' in secret 'another-mongodb'> Optional: false
      MONGODB_REPLICA_SET_KEY: <set to the key 'mongodb-replica-set-key' in secret 'another-mongodb'> Optional: false
      ALLOW_EMPTY_PASSWORD: no
      MONGODB_SYSTEM_LOG_VERBOSITY: 0
      MONGODB_DISABLE_SYSTEM_LOG: no
      MONGODB_ENABLE_IPV6: no
      MONGODB_ENABLE_DIRECTORY_PER_DB: no
    Mounts:
      /bitnami/mongodb from datadir (rw)
      /scripts/setup.sh from scripts (rw,path="setup.sh")
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-2z5qv (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  datadir:
    Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName: datadir-another-mongodb-0
    ReadOnly: false
  scripts:
    Type: ConfigMap (a volume populated by a ConfigMap)
    Name: another-mongodb-scripts
    Optional: false
  kube-api-access-2z5qv:
    Type: Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds: 3607
    ConfigMapName: kube-root-ca.crt
    ConfigMapOptional: <nil>
    DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age                  From               Message
  ----     ------            ---                  ----               -------
  Warning  FailedScheduling  21s (x6 over 4m30s)  default-scheduler  0/11 nodes are available: 1 node(s) didn't match pod affinity/anti-affinity rules, 1 node(s) didn't satisfy existing pods anti-affinity rules, 10 node(s) didn't match Pod's node affinity/selector.
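The same FailedScheduling events can also be pulled on their own while re-applying configurations, which is sometimes easier than re-describing the pod; the scheduler only reports aggregated per-reason counts, so the per-node explanation still has to be inferred from node and pod labels:
```
kubectl get events -n acceptance --field-selector involvedObject.name=another-mongodb-0,reason=FailedScheduling
```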
Edit: Here is a description of the node on which we would expect the pod to run, given the node affinity targeting a specific host:
NAME STATUS ROLES AGE VERSION LABELS
ip-xxx-xxx-xxx-xxx.eu-central-1.compute.internal Ready <none> 4d6h v1.21.5-eks-9017834 alpha.eksctl.io/cluster-name=acceptance-cluster,alpha.eksctl.io/nodegroup-name=acceptance-cluster-ng-another-mongo-db,beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=t3.small,beta.kubernetes.io/os=linux,eks.amazonaws.com/capacityType=ON_DEMAND,eks.amazonaws.com/nodegroup-image=ami-xxx,eks.amazonaws.com/nodegroup=acceptance-cluster-ng-another-mongo-db,eks.amazonaws.com/sourceLaunchTemplateId=lt-xxx,eks.amazonaws.com/sourceLaunchTemplateVersion=1,failure-domain.beta.kubernetes.io/region=eu-central-1,failure-domain.beta.kubernetes.io/zone=eu-central-1a,kubernetes.io/arch=amd64,kubernetes.io/hostname=ip-xxx-xxx-xxx-xxx.eu-central-1.compute.internal,kubernetes.io/os=linux,node.kubernetes.io/instance-type=t3.small,topology.ebs.csi.aws.com/zone=eu-central-1a,topology.kubernetes.io/region=eu-central-1,topology.kubernetes.io/zone=eu-central-1a