What's the best pattern for multiple background tasks in C# + Kubernetes?
It seems to me there must be an established pattern for this; it looks like a common (and therefore solved) problem.
So, I have a long-running task (probably hours) that I need to run rarely in production. It performs data loads and data patches for a live production API. I want it done as quickly as possible, yet not too quickly: "too quickly" means either overloading the database, or multiple threads/instances hitting the same piece of data and having to back off and retry because of concurrency conflicts.
The main service (API) in this domain is hosted in Kubernetes (in the cloud). How do I achieve this?
Two main questions bother me: how to correctly do this in the background and how to correctly scale horizontally.
Rough idea
The first thing that comes to mind is to have a web service (ASP.NET) exposing two routes: one to add another task and one to stop all tasks.
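Roughly, I picture the controller like the sketch below. This is only a sketch: `IBackgroundTaskQueue`, `TaskRequest` and the route names are placeholders I made up, to be backed by a hosted service further down.

```csharp
// Minimal controller sketch. IBackgroundTaskQueue, TaskRequest and the routes
// are hypothetical placeholders, not an existing API.
using Microsoft.AspNetCore.Mvc;

public interface IBackgroundTaskQueue
{
    int RunningCount { get; }
    void Enqueue(TaskRequest request);
    void CancelAll();
}

public record TaskRequest(string Kind, string? Argument);

[ApiController]
[Route("admin/tasks")]
public class BackgroundTasksController : ControllerBase
{
    private readonly IBackgroundTaskQueue _queue;

    public BackgroundTasksController(IBackgroundTaskQueue queue) => _queue = queue;

    // POST /admin/tasks — add another data load / data patch task
    [HttpPost]
    public IActionResult Enqueue([FromBody] TaskRequest request)
    {
        _queue.Enqueue(request);
        return Accepted();
    }

    // DELETE /admin/tasks — stop everything currently running on this instance
    [HttpDelete]
    public IActionResult StopAll()
    {
        _queue.CancelAll();
        return NoContent();
    }
}
```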
Background tasks?
Here I'm split between using HostingEnvironment.QueueBackgroundWorkItem and having a separate IHostedService (BackgroundService) that keeps a List of tuples (task and cancellation token). Adding a task would just start a new task with TaskFactory, add it to the list together with its cancellation token, and not await it. That part seems dirty.
QueueBackgroundWorkItem is simple and available out of the box. The BackgroundService gives more control: I could, for example, set a hard limit on the number of background tasks, or cancel tasks one by one via an additional API route.
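For the BackgroundService variant I imagine something like the following. It is a rough sketch implementing the hypothetical `IBackgroundTaskQueue` from the controller sketch above; the channel capacity and the per-pod cap of 2 are arbitrary numbers I would tune empirically.

```csharp
// Rough sketch: a bounded Channel feeds the work, a SemaphoreSlim enforces the
// hard per-instance limit, and each running task keeps its own
// CancellationTokenSource so "stop all" can cancel them.
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;
using Microsoft.Extensions.Hosting;

public sealed class QueuedTaskHostedService : BackgroundService, IBackgroundTaskQueue
{
    private readonly Channel<TaskRequest> _channel = Channel.CreateBounded<TaskRequest>(capacity: 100);
    private readonly ConcurrentDictionary<Guid, CancellationTokenSource> _running = new();
    private readonly SemaphoreSlim _concurrencyCap = new(initialCount: 2); // hard cap per pod, tune empirically

    public int RunningCount => _running.Count;

    public void Enqueue(TaskRequest request)
    {
        if (!_channel.Writer.TryWrite(request))
            throw new InvalidOperationException("Task queue is full.");
    }

    public void CancelAll()
    {
        foreach (var cts in _running.Values)
            cts.Cancel();
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        await foreach (var request in _channel.Reader.ReadAllAsync(stoppingToken))
        {
            // Respect the hard limit before starting anything new.
            await _concurrencyCap.WaitAsync(stoppingToken);

            var id = Guid.NewGuid();
            var cts = CancellationTokenSource.CreateLinkedTokenSource(stoppingToken);
            _running[id] = cts;

            // Fire-and-forget, but tracked: the dictionary + semaphore keep it bounded and cancellable.
            _ = Task.Run(async () =>
            {
                try
                {
                    await RunDataPatchAsync(request, cts.Token); // the actual hours-long work
                }
                finally
                {
                    _running.TryRemove(id, out _);
                    cts.Dispose();
                    _concurrencyCap.Release();
                }
            }, CancellationToken.None);
        }
    }

    private static async Task RunDataPatchAsync(TaskRequest request, CancellationToken ct)
    {
        // Placeholder for the real data load / patch loop; check ct between batches.
        await Task.Delay(TimeSpan.FromSeconds(1), ct);
    }
}
```

It would be registered once and exposed both ways, e.g. `services.AddSingleton<QueuedTaskHostedService>(); services.AddSingleton<IBackgroundTaskQueue>(sp => sp.GetRequiredService<QueuedTaskHostedService>()); services.AddHostedService(sp => sp.GetRequiredService<QueuedTaskHostedService>());`.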
A third option is to have the API push an event to RabbitMQ, and have all instances consume messages from it as the signal to start doing work. Then it doesn't have to be fire-and-forget, but it feels a bit heavy to pull RabbitMQ into a project that otherwise doesn't need it.
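If I went that way, I picture a consumer along these lines. It's a sketch based on the RabbitMQ.Client 6.x API; the queue name, host name and `RunDataPatch` are placeholders of mine. The prefetch of 1 in `BasicQos` is what would keep each pod to one job at a time, so extra replicas actually take work off the queue.

```csharp
// Sketch of the RabbitMQ variant (RabbitMQ.Client 6.x API assumed).
using System.Text;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Hosting;
using RabbitMQ.Client;
using RabbitMQ.Client.Events;

public sealed class DataPatchConsumer : BackgroundService
{
    protected override Task ExecuteAsync(CancellationToken stoppingToken)
    {
        var factory = new ConnectionFactory { HostName = "rabbitmq" }; // placeholder host
        var connection = factory.CreateConnection();
        var channel = connection.CreateModel();

        channel.QueueDeclare("data-patch-jobs", durable: true, exclusive: false, autoDelete: false);
        channel.BasicQos(prefetchSize: 0, prefetchCount: 1, global: false); // one job per pod at a time

        var consumer = new EventingBasicConsumer(channel);
        consumer.Received += (_, ea) =>
        {
            var payload = Encoding.UTF8.GetString(ea.Body.ToArray());
            try
            {
                RunDataPatch(payload, stoppingToken);              // the actual long-running work
                channel.BasicAck(ea.DeliveryTag, multiple: false);
            }
            catch
            {
                channel.BasicNack(ea.DeliveryTag, multiple: false, requeue: true);
            }
        };
        channel.BasicConsume("data-patch-jobs", autoAck: false, consumer);

        stoppingToken.Register(() => { channel.Close(); connection.Close(); });
        return Task.CompletedTask;
    }

    private static void RunDataPatch(string payload, CancellationToken ct)
    {
        // Placeholder for the real data load / patch loop; check ct between batches.
    }
}
```

In practice, hours of work inside the handler would need more care (RabbitMQ's consumer acknowledgement timeout, reconnection handling), but it shows the shape of the idea.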
Scaling?
With any of the proposed solutions, the load would go up with additional calls to the API. That would lead the k8s horizontal pod autoscaler to add new instances, but those new instances would neither reduce the load on the existing instance nor start doing any work. I'd have to make additional API calls and hope that round-robin load balancing spreads the work correctly, possibly by checking the number of running background tasks on each instance and not starting more than X (to be determined empirically).
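Concretely, the "not more than X" part would probably end up as a guard in the enqueue route, something like this fragment of the controller sketch above (the 429 response and `MaxTasksPerInstance` are my own placeholders):

```csharp
// Fragment of the Enqueue action from the controller sketch above.
// MaxTasksPerInstance ("X") would be determined empirically.
private const int MaxTasksPerInstance = 2;

[HttpPost]
public IActionResult Enqueue([FromBody] TaskRequest request)
{
    if (_queue.RunningCount >= MaxTasksPerInstance)
        return StatusCode(StatusCodes.Status429TooManyRequests,
            "This instance is busy; retry to hit another pod.");

    _queue.Enqueue(request);
    return Accepted();
}
```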
Doesn't feel exactly clean.
Alternatives
It feels like I'm shoehorning this problem into tools I've used before (ASP.NET + RabbitMQ). Is this the right approach?
I've considered Azure Functions, but they have a 10-minute limit.
I've considered k8s Jobs, but triggering, stopping, and scaling those would require someone with k8s access and experience, and I'd rather not have that limitation on who can operate this.
What haven't I considered?
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
