Ensuring at most a single instance of a Job executes on Kubernetes and writes to PostgreSQL
I have a Python program that runs as a Job on a Kubernetes cluster every 2 hours. I also have a webserver that starts the Job whenever a user clicks a button on a page.
I need to ensure that at most one instance of the Job is running on the cluster at any given time.
Given that I am using Kubernetes to run the Job and connecting to PostgreSQL from within it, the solution should somehow leverage these two. I thought a bit about it and came up with the following ideas:
- Find a setting in Kubernetes that enforces this limit, so that attempts to start a second instance fail. I was unable to find such a setting.
- Create a shared lock, or mutex. The disadvantage is that if the job crashes, it may not release the lock before quitting.
- Kubernetes is running etcd; maybe I can use that.
- Create a 'lock' table in PostgreSQL; when a new instance connects, it checks whether it is the only one running. Use transactions somehow so that one wins and proceeds while the others quit. I have not thought this through yet, but it should work.
- Query the Kubernetes API for a label I use on the Job and see whether any instances exist. This may not be atomic, so more than one instance may slip through.
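To make the PostgreSQL idea more concrete, here is a minimal sketch of how I imagine it could look. It uses a session-level advisory lock (`pg_try_advisory_lock`) instead of a hand-made lock table, which is a substitution on my part. The lock name `my-periodic-job` and the helper names are hypothetical, and `conn` is assumed to be a DB-API connection such as one from psycopg2. PostgreSQL releases session-level advisory locks when the session ends, so a crashed job should not leave a stale lock behind.

```python
import zlib

# Hypothetical lock name; every instance of the job must use the same string.
LOCK_NAME = "my-periodic-job"


def lock_key(name: str) -> int:
    """Map a lock name to a stable integer key for pg_try_advisory_lock."""
    # crc32 is stable across processes (unlike hash()) and fits in a bigint.
    return zlib.crc32(name.encode("utf-8"))


def try_acquire_job_lock(conn, name: str = LOCK_NAME) -> bool:
    """Return True if this session obtained the advisory lock, else False.

    `conn` is assumed to be a DB-API connection to PostgreSQL.
    The call does not block: if another session already holds the lock,
    PostgreSQL answers false immediately.
    """
    cur = conn.cursor()
    try:
        cur.execute("SELECT pg_try_advisory_lock(%s)", (lock_key(name),))
        (acquired,) = cur.fetchone()
        return bool(acquired)
    finally:
        cur.close()


# Usage sketch (assumes psycopg2 and a hypothetical JOB_DSN environment variable):
#   conn = psycopg2.connect(os.environ["JOB_DSN"])
#   if not try_acquire_job_lock(conn):
#       sys.exit(0)  # another instance is already running
#   ...do the work, keeping this same connection open until the job ends...
```

The lock lives only as long as the connection that took it, so the job would have to keep that connection open for its whole run. I also assume there is no transaction-mode connection pooler between the job and the database, since session-level locks do not work reliably through one.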
What are the usual solutions to this problem given the platform choice I made? What should I do, so that I don't reinvent the wheel and have something reliable?
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
