'How buffer/delay incoming HTTP request until down backend wakes up?

https://companyA.acme.org/custom/api/endpoint1 is hosted on a dedicated ec2 instance https://companyB.acme.org/another/custom/apiendpoint is hosted on a dedicated ec2 instance

(both ec2 instance are using the same core app, each customer can customize the catalog of API endpoints)

Most of the time those ec2 instances are idle, so we secretly want to stop them, but we don't want the customer to care about the instance being up or not. We can accept a 2 sec delay on response timing when instance needs to be wake up before answering the customer API call

My idea is to intercept all incoming HTTP request and buffer them before routing + forwarding. I need a delay to be able to check if a backend matching the subdomain is up or not and wake it up if it is down.

Anyone knows any existing proxy / load balancing solution able to buffer / queue HTTP requests, then allows to do some custom magic (in order to launch the right ec2 instance), then forward the request based on Origin/Referer ?

(the answer being probably every existing proxy for the last part )

I was thinking about the following:

But I am not sure on how NGINX will handle the request buffering / forwarding while I start the ec2 host.

Bonus question : how not to waste any time forwarding the request in case the backend is already up ?



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source