How do I use pod anti-affinity to allocate a kernel-dangerous k8s pod to its own HW?
I have a k8s-based JupyterHub cluster (built using the z2jh helm chart). Each user runs JupyterLab in their own private pod, but the GPU abstraction is leaky (and there are other shared resources, like kernel IO, to contend over), so pods can still interfere with one another. I have a destructively productive scientist user, kev, whose heavy ML training workload routinely brings down other services.
Among the k8s labels available on this particularly-demanding user's pod is:
hub.jupyter.org/username: kev
I'd like to allocate kev's pod to its own hardware, apart from other users' pods, my web services, etc.
I thought this might be achievable using pod anti-affinity (https://zero-to-jupyterhub.readthedocs.io/en/latest/resources/reference.html#singleuser-extrapodantiaffinity-required), which I hope to apply unconditionally to all my users' pods, but which would only take effect for kev's pod, i.e. when the pod's labels match hub.jupyter.org/username: kev.
I've been trying to fashion a k8s pod anti-affinity selector (https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#labelselector-v1-meta) that I could apply to all pods, such that pods matching hub.jupyter.org/username: kev end up with anti-affinity for all other pods matching some common label (e.g. I would hope kev's pod would repel any other pod with the label app: myjupcluster). But I'm either failing to get the logic right, or perhaps that's a sign this isn't expressible with the operators/logic available to pod anti-affinity selectors?
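For concreteness, the closest I have got is a uniform rule of the following shape, passed to every user pod via singleuser.extraPodAntiAffinity.required. This is a sketch of my attempt, not a working solution; the topologyKey choice is my assumption:

```yaml
singleuser:
  extraPodAntiAffinity:
    required:
      # Each entry is a PodAffinityTerm: a pod carrying this rule may not be
      # scheduled onto a node (topologyKey: kubernetes.io/hostname) that is
      # already running a pod whose labels match the selector below.
      - labelSelector:
          matchExpressions:
            - key: hub.jupyter.org/username
              operator: In
              values:
                - kev
        topologyKey: kubernetes.io/hostname
```

Applied to all user pods, this keeps everyone else off whichever node kev's pod lands on, but (as I understand scheduling) anti-affinity is only evaluated against pods already running on a node, and it is not symmetric: nothing here stops kev's pod from being scheduled onto a node that already hosts other users' pods.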
Sources
Source: Stack Overflow, licensed under CC BY-SA 3.0.