VertexAI Batch Inference Failing for Custom Container Model

I'm having trouble running Vertex AI batch inference, even though endpoint deployment and online inference work perfectly. My TensorFlow model was trained in a custom Docker container with the following arguments:

aiplatform.CustomContainerTrainingJob(
        display_name=display_name,
        command=["python3", "train.py"],
        container_uri=container_uri,
        model_serving_container_image_uri=container_uri,
        model_serving_container_environment_variables=env_vars,
        model_serving_container_predict_route='/predict',
        model_serving_container_health_route='/health',
        model_serving_container_command=[
            "gunicorn",
            "src.inference:app",
            "--bind",
            "0.0.0.0:5000",
            "-k",
            "uvicorn.workers.UvicornWorker",
            "-t",
            "6000",
        ],
        model_serving_container_ports=[5000],
)

I have Flask endpoints for predict and health, defined essentially as follows:

@app.get("/health")
def health_check_batch():
    return 200

@app.post("/predict")
def predict_batch(request_body: dict):
    pred_df = pd.DataFrame(request_body['instances'],
                           columns=request_body['parameters']['columns'])
    # do some model inference things
    return {"predictions": predictions.tolist()}
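
One thing that may be worth ruling out: the handler above indexes `request_body['parameters']['columns']` unconditionally, so any request that arrives without a `parameters` field would raise a `KeyError`. The following is a minimal sketch, not the actual serving code, of parsing the body defensively. `parse_request` and `DEFAULT_COLUMNS` are hypothetical names introduced for illustration, and it uses plain Python rather than pandas so that it runs on its own.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical fallback; the real column names depend on the model.
DEFAULT_COLUMNS = ["first", "second"]


def parse_request(request_body: Dict[str, Any]) -> Tuple[List[List[Any]], List[str]]:
    """Return (rows, columns), tolerating a missing or empty "parameters" field."""
    rows = request_body.get("instances", [])
    params = request_body.get("parameters") or {}
    columns = params.get("columns", DEFAULT_COLUMNS)
    # Fail with a readable message if a row does not match the column count.
    for row in rows:
        if len(row) != len(columns):
            raise ValueError(
                f"expected {len(columns)} values per instance, got {len(row)}"
            )
    return rows, columns
```

The returned pair could then be passed straight to `pd.DataFrame(rows, columns=columns)` inside the predict route.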

As described, after training the model and deploying it to an endpoint, I can successfully call the API with a JSON body like:

{"instances":[[1,2], [1,3]], "parameters":{"columns":["first", "second"]}}

This also works when using the endpoint Python SDK and passing instances and parameters as function arguments.
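
For reference, the online call described above might look like the sketch below. `build_payload` and `predict_online` are hypothetical helper names, and the project, location, and endpoint ID are placeholders to be supplied by the caller. `aiplatform.Endpoint.predict` accepts `instances` and `parameters` as separate arguments.

```python
import json
from typing import Any, Dict, List


def build_payload(instances: List[List[Any]], columns: List[str]) -> Dict[str, Any]:
    """Build the request body in the shape the /predict route expects."""
    return {"instances": instances, "parameters": {"columns": columns}}


def predict_online(endpoint_id: str, project: str, location: str,
                   payload: Dict[str, Any]):
    """Send the payload to a deployed endpoint (requires google-cloud-aiplatform)."""
    # Imported lazily so build_payload can be used without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location)
    endpoint = aiplatform.Endpoint(endpoint_id)
    return endpoint.predict(instances=payload["instances"],
                            parameters=payload["parameters"])


payload = build_payload([[1, 2], [1, 3]], ["first", "second"])
print(json.dumps(payload))
```

Only `build_payload` runs at import time; `predict_online` needs valid credentials and a deployed endpoint.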

However, I've tried running batch inference with both a CSV file and a JSONL file, and every time it fails with Error Code 3. I can't find any logs in Logs Explorer explaining why it failed either. I've read through all the documentation I could find and have seen others successfully invoke batch inference, but I haven't been able to find a guide. Does anyone have recommendations on the batch file structure or the structure of my APIs? Thank you!
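
As a possible starting point for the file structure, the sketch below writes a JSONL input file and submits a batch job through the Python SDK. It assumes that each line of a JSONL input holds a single instance (not the whole `{"instances": ...}` body) and that request-level values such as `columns` are supplied through `model_parameters`. The helper names `write_jsonl` and `submit_batch_job`, the job display name, and the machine type are hypothetical placeholders. This is not confirmed as the fix for the Error Code 3 above.

```python
import json
from typing import Any, Dict, List


def write_jsonl(path: str, instances: List[List[Any]]) -> None:
    """Write one JSON-encoded instance per line."""
    with open(path, "w", encoding="utf-8") as f:
        for instance in instances:
            f.write(json.dumps(instance) + "\n")


def submit_batch_job(model_name: str, gcs_source: str,
                     gcs_destination_prefix: str,
                     model_parameters: Dict[str, Any]):
    """Submit a batch prediction job (requires google-cloud-aiplatform)."""
    # Imported lazily; the SDK is not needed just to write the input file.
    from google.cloud import aiplatform

    model = aiplatform.Model(model_name)
    return model.batch_predict(
        job_display_name="batch-predict-example",  # placeholder name
        gcs_source=gcs_source,                     # Cloud Storage URI of the uploaded JSONL file
        gcs_destination_prefix=gcs_destination_prefix,
        instances_format="jsonl",
        predictions_format="jsonl",
        model_parameters=model_parameters,         # e.g. {"columns": ["first", "second"]}
        machine_type="n1-standard-4",              # placeholder machine type
    )


# Writes two lines: [1, 2] and [1, 3]
write_jsonl("batch_input.jsonl", [[1, 2], [1, 3]])
```

The local file would still need to be uploaded to a Cloud Storage bucket before `submit_batch_job` can reference it.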



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
