Kubernetes AntiAffinity - limit max number of same pods per node

I have a Kubernetes cluster with 4 nodes and a pod deployed as a Deployment with 8 replicas. When I deploy it, Kubernetes sometimes schedules 4 pods on node1 and the other 4 pods on node2. In that case node3 and node4 don't run this container at all (although other containers are running there).

I do understand pod affinity and anti-affinity, where the documentation has the ZooKeeper example for pod anti-affinity, which is great. That makes sure that no 2 pods are deployed on the same node.

This is fine; however, my requirement is slightly different: I want to restrict the maximum number of pods k8s can deploy to one node using anti-affinity.

I need to make sure that no more than 3 instances of the same pod are deployed on a node in the example above. I thought of setting a memory/CPU limit on the pods, but that seemed like a bad idea because my nodes have different configurations. Is there any way to achieve this?

(Update 1) - I understand that my question wasn't clear enough. To clarify further, what I want is to limit the instances of a pod to a maximum of 3 per node for a particular deployment. For example, how do I tell k8s not to deploy more than 3 instances of the nginx pod per node? The restriction should apply only to the nginx deployment and not to other deployments.

(Update 2) - To explain further with a scenario: a k8s cluster with 4 worker nodes and 2 deployments.

  1. An nginx deployment -> replicas = 10
  2. A custom user agent deployment -> replicas = 10

Requirement - Hey Kubernetes, I want to schedule 10 pods of the "custom user agent" deployment (#2 in this example) on 4 nodes, but I want to make sure that each node has at most 3 pods of the "custom user agent". For the "nginx" pods there shouldn't be any such restriction, which means I don't mind if k8s schedules 5 nginx pods on one node and the other 5 on a second node.



Solution 1:[1]

I didn't find official documentation for exactly this, but I think you can use podAntiAffinity with the preferredDuringSchedulingIgnoredDuringExecution option. This makes the scheduler prefer not to place the same pods on a single node, but if that is not possible it will still select the most eligible existing node. Because it is only a preference, it does not enforce a hard maximum per node. See the official documentation.

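affinity:
  podAntiAffinity:
    # Soft rule: the scheduler prefers, but does not require, separate nodes.
    preferredDuringSchedulingIgnoredDuringExecution:
    - podAffinityTerm:
        labelSelector:
          matchLabels:
            # Must match a label set on the deployment's pod template.
            name: deployment-name
        # Each node (hostname) counts as its own topology domain.
        topologyKey: kubernetes.io/hostname
      # Highest weight (range 1-100), so this preference dominates scoring.
      weight: 100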

Solution 2:[2]

Setting a bare minimum number of pods on each node can be achieved with topologyKey.

Yes, you can make a Deployment spawn a pod on every node by using pod anti-affinity with topologyKey set to "kubernetes.io/hostname".

With the above example, you will have the following behaviour:

[Image: topology key]

I hope that's what you are looking for.
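
A minimal sketch of the one-pod-per-node placement this answer describes, assuming a hard (required) pod anti-affinity rule. This is an illustration rather than the answerer's original example; the `app: custom-user-agent` label is hypothetical, and the fragment belongs under the pod template's `spec`.

```yaml
affinity:
  podAntiAffinity:
    # Hard rule: a node that already runs a matching pod is excluded.
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          # Hypothetical label; it must match the pod template's own labels.
          app: custom-user-agent
      # One topology domain per node, so at most one matching pod per node.
      topologyKey: kubernetes.io/hostname
```

Note that this caps the deployment at one pod per node, which is stricter than the limit of 3 asked for: with 10 replicas on 4 nodes, the pods that cannot be placed would stay Pending.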

Solution 3:[3]

That feature is in alpha; I believe it is called topologyKey. Depending on your Kubernetes version, you may be able to use it. See https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/

Solution 4:[4]

I believe what you want can be achieved via the maxSkew parameter of pod topology spread constraints. Please check the official documentation: https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/
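
As a rough sketch of that suggestion, the Deployment below spreads the "custom user agent" pods across nodes with `maxSkew: 1`. The deployment name, the `app: custom-user-agent` label and the image are hypothetical. Keep in mind that maxSkew bounds the difference in matching pod counts between nodes rather than setting an absolute cap, so the 3-per-node ceiling here follows from 10 replicas over 4 eligible nodes (3, 3, 2, 2) and is not a general guarantee.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: custom-user-agent            # hypothetical name
spec:
  replicas: 10
  selector:
    matchLabels:
      app: custom-user-agent         # hypothetical label
  template:
    metadata:
      labels:
        app: custom-user-agent
    spec:
      topologySpreadConstraints:
      # Allow at most a difference of 1 matching pod between any two nodes.
      - maxSkew: 1
        # Each node (hostname) is its own topology domain.
        topologyKey: kubernetes.io/hostname
        # Hard constraint: leave the pod Pending rather than exceed the skew.
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: custom-user-agent
      containers:
      - name: custom-user-agent
        image: example.com/custom-user-agent:latest   # hypothetical image
```

Because the constraint is set only on this pod template, the nginx deployment is unaffected. If the replica count or the number of eligible nodes changes, the per-node maximum changes with it, and the constraint is evaluated at scheduling time, so pods that are already running are not rebalanced.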

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution    Source
Solution 1  Vineet Reynolds
Solution 2  redzack
Solution 3  V3RL4223N3
Solution 4  tinybiscuits