Selecting one Kubernetes pod among a set of copies to perform a task

I have a java application that runs inside a Kubernetes pod.

The application performs several tasks (taskA, taskB, etc.). The application supports running multiple instances in different pods. All the pods are doing the same tasks.

However, there is one task that should be done by only one of the pods (e.g. taskA should only run in one of the pods). And if the pod that is performing that task dies, one of the other pods should start doing it (a passive node, with regard to taskA, takes over).

Is there some support for this feature in k8s, or do I need to use some other service (e.g. ZooKeeper)?



Solution 1:[1]

Kubernetes doesn't know what's running inside your pods or what's inside your requests. There's (almost) no direct support for what you're suggesting.

The more general topic you might look up is leader election. In your example, the cluster leader is the only one that can perform taskA. ZooKeeper is popularly used for this; there's a newer algorithm called Raft that can work too. You'd have to write support for this into your application.
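To make the idea concrete, here is a minimal sketch of the acquire/renew loop behind lease-based leader election, using an in-memory compare-and-swap lock as a stand-in for the real shared store (a ZooKeeper znode, a Kubernetes Lease object, etc.). All class and method names here are illustrative, not from any real client library:

```java
import java.util.concurrent.atomic.AtomicReference;

public class LeaderElectionSketch {
    // Toy lease record: holder identity plus expiry time in millis.
    record Lease(String holder, long expiresAt) {}

    // Stand-in for the shared lock that all pods can see.
    static final AtomicReference<Lease> lock = new AtomicReference<>();

    static final long TTL_MS = 15_000;

    // Try to become (or stay) leader. Returns true if we hold the lease.
    static boolean tryAcquire(String myId, long now) {
        Lease current = lock.get();
        if (current == null || current.expiresAt() <= now
                || current.holder().equals(myId)) {
            Lease renewed = new Lease(myId, now + TTL_MS);
            // Compare-and-swap: only one contender wins the race.
            return lock.compareAndSet(current, renewed);
        }
        return false; // someone else holds a live lease
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        System.out.println("pod-a leader: " + tryAcquire("pod-a", now));
        System.out.println("pod-b leader: " + tryAcquire("pod-b", now));
        // Only the leader runs taskA; followers retry after TTL expiry,
        // so a new leader takes over when the old one stops renewing.
    }
}
```

The TTL is what gives you failover: if the leader pod dies and stops renewing, the lease expires and any follower's next `tryAcquire` succeeds.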

The closest thing in Kubernetes is a StatefulSet. The helpful feature of a StatefulSet is that its replicas are numbered starting from 0, so a pod can look at its own hostname and figure out which one it is. If a lower-numbered pod fails, then Kubernetes will restart it with the same identity. You could set the rule that only pod 0 may perform taskA, which in practice might meet your requirements.
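In a StatefulSet, each pod's hostname ends with its ordinal (e.g. myapp-0, myapp-1), and Kubernetes sets the HOSTNAME environment variable to the pod name inside the container. A sketch of the "only pod 0 runs taskA" rule could look like this (the myapp name is just an example):

```java
public class OrdinalCheck {
    // Extract the trailing ordinal from a StatefulSet pod hostname like "myapp-2".
    static int ordinalFromHostname(String hostname) {
        int dash = hostname.lastIndexOf('-');
        if (dash < 0 || dash == hostname.length() - 1) {
            throw new IllegalArgumentException("not a StatefulSet hostname: " + hostname);
        }
        return Integer.parseInt(hostname.substring(dash + 1));
    }

    public static void main(String[] args) {
        // Inside the pod, HOSTNAME holds the pod name; default here is for local runs.
        String hostname = System.getenv().getOrDefault("HOSTNAME", "myapp-0");
        if (ordinalFromHostname(hostname) == 0) {
            System.out.println("I am pod 0, running taskA");
        } else {
            System.out.println("Not pod 0, skipping taskA");
        }
    }
}
```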

Solution 2:[2]

After researching David's answer, I found the Kubernetes blog recommends a way to do leader election:

https://kubernetes.io/blog/2016/01/simple-leader-election-with-kubernetes/

Newer articles with improvements to that algorithm have come out since then, but according to this article they are still based on the original technique:

https://medium.com/hybrid-cloud-hobbyist/leader-election-architecture-kubernetes-32600da81e3c

It looks like you'll have to copy a bit of code or add in a new external dependency for this.

Solution 3:[3]

Expanding on Malgorzata's answer: you may be able to have a separate single-instance application launch a task by making an HTTP request to your main app, or by publishing a message to a message-queue system like Kafka, hopefully without needing to duplicate or share too much code/configuration with your main app. There may be some setup cost to deploying it alongside the main app, too.
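As a sketch of the HTTP variant: the single-instance "trigger" process just POSTs to an endpoint the main app exposes. The /run-taskA path below is made up for illustration, and a local test server stands in for the main app so the example is self-contained:

```java
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class TaskTrigger {
    // Starts a stand-in for the main app, fires one trigger request at it,
    // and returns the HTTP status the trigger saw.
    static int runDemo() throws Exception {
        // Stand-in for the main app: handles the (hypothetical) /run-taskA endpoint.
        HttpServer mainApp = HttpServer.create(new InetSocketAddress(0), 0);
        mainApp.createContext("/run-taskA", exchange -> {
            byte[] body = "taskA started".getBytes();
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        mainApp.start();
        int port = mainApp.getAddress().getPort();

        // The single-instance trigger: one POST kicks off taskA exactly once.
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:" + port + "/run-taskA"))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());

        mainApp.stop(0);
        return response.statusCode();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("trigger got HTTP " + runDemo());
    }
}
```

In a real deployment the trigger would run as its own single-replica Deployment (or a CronJob, if the task is periodic) and the URL would point at the main app's Service.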

In my case I needed one task to scan an FTP directory and put newly found filenames into Kafka, and I realized my company supported Confluent Kafka connectors so I was able to just use that instead.

Just giving this as an alternative to leader election. Leader election seems more robust if you're willing to implement it: there's no need for a separate component that may logically belong in the main app anyway, and no single pod that can go down and take a long time to restart, i.e. a single point of failure.

Solution 4:[4]

As far as I understand, you could use a ReplicaSet.

You could create two ReplicaSets: one for taskA and one for taskB.

A ReplicaSet is defined with fields, including a selector that specifies how to identify Pods it can acquire, a number of replicas indicating how many Pods it should be maintaining, and a pod template specifying the data of new Pods it should create to meet the number of replicas criteria. A ReplicaSet then fulfills its purpose by creating and deleting Pods as needed to reach the desired number. When a ReplicaSet needs to create new Pods, it uses its Pod template.

The link a ReplicaSet has to its Pods is via the Pods’ metadata.ownerReferences field, which specifies what resource the current object is owned by. All Pods acquired by a ReplicaSet have their owning ReplicaSet’s identifying information within their ownerReferences field. It’s through this link that the ReplicaSet knows of the state of the Pods it is maintaining and plans accordingly.
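For example, a Pod acquired by the ReplicaSet below would carry an ownerReferences entry roughly like this (the pod name suffix and uid are placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: example-abc12
  ownerReferences:
  - apiVersion: apps/v1
    kind: ReplicaSet
    name: example
    controller: true
    blockOwnerDeletion: true
    uid: <replicaset-uid>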

A ReplicaSet identifies new Pods to acquire by using its selector. If there is a Pod that has no OwnerReference or the OwnerReference is not a Controller and it matches a ReplicaSet’s selector, it will be immediately acquired by said ReplicaSet.

A ReplicaSet is responsible for running specified number of pod replicas at any given time.

Here is a simple YAML configuration file for a ReplicaSet:

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: example
  labels:
    app: guestbook
    tier: eg
spec:
  replicas: 1 # set the desired number of replicas
  selector:
    matchLabels:
      tier: eg
  template:
    metadata:
      labels:
        tier: eg
    spec:
      containers:
      - name: php
        image: gcr.io/google_samples/gb-frontend:v3

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 David Maze
Solution 2 Alexander Taylor
Solution 3 Alexander Taylor
Solution 4