HealthCheck with TensorFlow Serving

We are setting up a fleet of TensorFlow Serving (TFS) instances on ECS/Fargate (using a Docker image with TFS). The instances sit behind a load balancer (AWS ALB), which needs to check the health of each instance in the fleet. TFS does not provide health checks out of the box, so we need to implement our own. Furthermore, we are using gRPC for incoming inference requests.

We have several options:

  • Implement gRPC health checks (discussion here) as a separate process within the container. However, TFS already owns the gRPC port inside the instance, and I don't believe that two processes can share that port.
  • Rely on Docker's built-in HEALTHCHECK. But I don't think that the ALB will accept a Docker health check.
  • Bring up a small web server in the container that exposes the health check over HTTP.
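To make the third option concrete, here is a minimal sketch of such a web server using only the Python standard library. Everything named here is an assumption for illustration: the `/health` path, port 8080 for the health endpoint, port 8500 for the TFS gRPC port, and the idea that "healthy" means "the TFS port accepts a TCP connection". A TCP probe only shows that something is listening, not that a model is loaded.

```python
import socket
from http.server import BaseHTTPRequestHandler, HTTPServer

TFS_HOST = "127.0.0.1"  # assumption: TFS runs in the same container
TFS_GRPC_PORT = 8500    # assumption: the gRPC port TFS was started with
HEALTH_PORT = 8080      # hypothetical port the ALB health check would target


def tfs_port_open(host=TFS_HOST, port=TFS_GRPC_PORT, timeout=1.0):
    """Return True if a TCP connection to the TFS port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def make_handler(check):
    """Build a request handler that reports the result of check()."""

    class HealthHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/health":
                self.send_error(404)
                return
            healthy = check()
            body = b"ok" if healthy else b"unavailable"
            # 200 marks the target healthy; 503 marks it unhealthy.
            self.send_response(200 if healthy else 503)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, fmt, *args):
            pass  # keep periodic health polling out of the logs

    return HealthHandler


def serve(check=tfs_port_open, port=HEALTH_PORT):
    """Serve the health endpoint until the process is stopped."""
    HTTPServer(("0.0.0.0", port), make_handler(check)).serve_forever()
```

Calling `serve()` from a second process in the container (started next to TFS by the entrypoint) would give the ALB an HTTP target to poll. The `check` argument is pluggable so that a stricter probe can replace the TCP connect.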

But even if I do solve the above problem, how should I check the status of the TFS process itself? Is there a call to TFS that I can use to confirm it is running properly?
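One candidate call, as far as I can tell, is the model status endpoint of the TFS REST API (`GET /v1/models/<model_name>`), which reports a state such as `AVAILABLE` for each model version; the gRPC counterpart appears to be `ModelService.GetModelStatus`. The sketch below assumes the REST API is enabled via `--rest_api_port` on port 8501 in addition to gRPC, and it treats "at least one version is AVAILABLE" as healthy, which is my own criterion. The helper names and the model name in the usage note are hypothetical.

```python
import http.client
import json
import urllib.request


def model_is_available(status):
    """Return True if any version in a model-status response is AVAILABLE."""
    versions = status.get("model_version_status", [])
    return any(v.get("state") == "AVAILABLE" for v in versions)


def fetch_model_status(model_name, host="127.0.0.1", port=8501, timeout=2.0):
    """Query the TFS REST model-status endpoint and return the parsed JSON.

    Assumes TFS was started with its REST API enabled on `port`.
    """
    url = f"http://{host}:{port}/v1/models/{model_name}"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def tfs_is_healthy(model_name, **kwargs):
    """Return False on any connection, HTTP, or parse failure."""
    try:
        return model_is_available(fetch_model_status(model_name, **kwargs))
    except (OSError, ValueError, http.client.HTTPException):
        return False
```

For example, `tfs_is_healthy("my_model")` could back whatever endpoint the load balancer polls, so that an instance only counts as healthy once its model has actually loaded.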

Fundamentally, how does one monitor the health of TFS instances so that the result can serve as a health check for load balancing?



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source