HealthCheck with TensorFlow Serving
We are setting up a fleet of TensorFlow Serving (TFS) instances on ECS/Fargate, using a Docker image with TFS. The instances will sit behind a load balancer (AWS ALB), which needs to check the health status of each instance in the fleet. TFS does not provide health checks out of the box, so we need to implement our own. Furthermore, we are using gRPC for incoming inference requests.
We have several options:
- Implement gRPC health checks (discussion here) as a separate process within the container, but TFS already owns the gRPC port connection inside the instance. I don't believe that two processes can share the gRPC port.
- Depend on the built-in Docker HealthCheck. But I don't think that the ALB will accept a Docker HealthCheck.
- Bring up a small web server in the container to implement the health check over HTTP web interface.
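The third option could look roughly like the sketch below, which uses only the Python standard library. It is a minimal illustration, not a tested production setup: the `/healthz` path, port 8080, and the helper names `port_is_open`, `make_handler`, and `serve` are all assumptions made for this example, and the default probe only verifies that something accepts TCP connections on the TFS gRPC port (assumed here to be 8500), which is weaker than confirming that a model can actually serve requests.

```python
import socket
from http.server import BaseHTTPRequestHandler, HTTPServer

TFS_HOST = "127.0.0.1"  # assumption: TFS runs in the same container
TFS_GRPC_PORT = 8500    # assumption: TFS listens for gRPC on port 8500


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def make_handler(probe):
    """Build a request handler that reports the result of probe() on /healthz."""

    class HealthHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/healthz":
                self.send_error(404)
                return
            healthy = probe()
            body = b"ok" if healthy else b"unhealthy"
            # 200 tells the load balancer the target is healthy, 503 that it is not.
            self.send_response(200 if healthy else 503)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            # Keep frequent health-check requests out of the container logs.
            pass

    return HealthHandler


def serve(port=8080):
    """Run the health endpoint until the process is stopped."""
    handler = make_handler(lambda: port_is_open(TFS_HOST, TFS_GRPC_PORT))
    HTTPServer(("0.0.0.0", port), handler).serve_forever()


if __name__ == "__main__":
    serve()
```

With something like this running next to TFS, the ALB target group health check would point at the HTTP port and path chosen here, while inference traffic keeps using the gRPC port.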
But even if I do solve the above problem, how should I check the status of the TFS process itself? Is there a call to TFS that I can use to confirm it is running properly?
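One candidate for such a call is the model status endpoint of the TFS REST API, `GET /v1/models/<model_name>`, which reports a state such as `AVAILABLE` for each loaded model version. The sketch below assumes the REST API is enabled (for example with `--rest_api_port=8501`) in addition to gRPC; the function names `model_is_available` and `tfs_model_ready`, the host, and the timeout are hypothetical choices for illustration.

```python
import json
import urllib.request


def model_is_available(status_doc):
    """Return True if any version in a model status response is AVAILABLE."""
    if not isinstance(status_doc, dict):
        return False
    versions = status_doc.get("model_version_status", [])
    return any(v.get("state") == "AVAILABLE" for v in versions)


def tfs_model_ready(model_name, host="127.0.0.1", rest_port=8501, timeout=2.0):
    """Query the TFS REST model status endpoint; any failure counts as not ready."""
    # Assumption: TFS was started with its REST API enabled on rest_port.
    url = f"http://{host}:{rest_port}/v1/models/{model_name}"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return model_is_available(json.load(resp))
    except (OSError, ValueError):
        # OSError covers refused connections, timeouts, and HTTP errors;
        # ValueError covers a body that is not valid JSON.
        return False
```

A function like `tfs_model_ready("my_model")` (with `my_model` standing in for the real model name) could back an HTTP health endpoint or a container-level health command, so that an instance is reported healthy only once a model version has finished loading.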
Fundamentally, how does one monitor the health of TFS instances so that a load balancer can use the result as its health check?
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
