SageMaker endpoint with TensorFlow container ignoring the inference.py file
I'm using a TensorFlow model which is saved as follows:
tf-model
|--00000123
   |--assets
   |--variables
   |  |--variables.data-00000-of-00001
   |  |--variables.index
   |--keras_metadata.pb
   |--saved_model.pb
The TF model is getting picked up and works as an endpoint: when I associate a SageMaker predictor object with the endpoint and run it, it returns what I expect.
However, I want to invoke it with a JSON POST request and get a JSON response back, the same as with sklearn, XGBoost, or PyTorch endpoints.
I tried to implement this in the inference.py script which I'm passing as an entry point, but no matter what I try, the endpoint just seems to ignore the inference.py script.
I use the script almost exactly as given at the end of this page: https://sagemaker.readthedocs.io/en/stable/frameworks/tensorflow/using_tf.html
I've tried both versions (input/output handler pair and a single handler), I've tried leaving it in the SageMaker environment, I've tried packaging it up inside the tar.gz file, and I've put it in an S3 bucket with pointers and environment variables (via TensorFlowModel kwargs) to it, but no matter what I try, it just ignores the inference.py.
I know it ignores it because I have made minor edits to the application/json branch of the input handler and none of these edits show up; even when I change it so it only accepts text/csv or something else, the changes are not reflected.
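For reference, a minimal handler pair in the style of that documentation page looks roughly like this (the `data` stream, `context.request_content_type`, `context.accept_header`, and the TF Serving `response.content` are supplied by the container; the "instances" wrapping assumes the client posts a bare list of inputs):

```python
import json

def input_handler(data, context):
    """Deserialize the request body into the TF Serving JSON format."""
    if context.request_content_type == "application/json":
        payload = json.loads(data.read().decode("utf-8"))
        # TF Serving's REST API expects {"instances": [...]}
        return json.dumps({"instances": payload})
    raise ValueError(f"Unsupported content type: {context.request_content_type}")

def output_handler(response, context):
    """Pass the TF Serving response body through with the requested content type."""
    return response.content, context.accept_header
```

If edits to the `application/json` branch here never show up in the endpoint's behaviour, the script is not being loaded at all, which points to the artifact layout rather than the handler code.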
What am I doing wrong? How can I get the TensorFlow serverless endpoint to return the output in the POST response instead of saving the output to S3, as per its default behaviour?
Solution 1:[1]
Your untarred model artifacts should look something like this (where inference.py is located in the code/ folder):
model1
|--[model_version_number]
|--variables
|--saved_model.pb
model2
|--[model_version_number]
|--assets
|--variables
|--saved_model.pb
code
|--lib
|--external_module
|--inference.py
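Packaging that layout from Python might look like the sketch below. The placeholder files stand in for a real SavedModel and your own inference.py; the key point is that the model folder and code/ both sit at the root of the archive:

```python
import os
import tarfile

# Throwaway placeholder layout mirroring the tree above; in practice the
# SavedModel files come from model.save(...) and inference.py is your own.
os.makedirs("model1/00000123/variables", exist_ok=True)
open("model1/00000123/saved_model.pb", "wb").close()
os.makedirs("code/lib", exist_ok=True)
open("code/inference.py", "w").close()

# code/ must be at the archive root, next to the model folder(s),
# or the container will not pick up inference.py.
with tarfile.open("model.tar.gz", "w:gz") as tar:
    tar.add("model1")
    tar.add("code")
```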
See this link for more information.
That said, by default application/json is supported for both requests and responses by the SageMaker TensorFlow Serving container.
Thus, you can send JSON in (with the tensor shape your model expects) and receive a JSON response.
This question/answer explains how to make the POST requests using Postman.
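For example, the default JSON request body wraps the input tensor(s) in an "instances" list (the feature vector here is illustrative, not from the asker's model):

```python
import json

# Illustrative request body for a model expecting a length-4 feature vector;
# the TensorFlow Serving REST API wraps inputs in an "instances" list.
request_body = json.dumps({"instances": [[5.1, 3.5, 1.4, 0.2]]})

# The response comes back in the same style, keyed by "predictions",
# e.g. {"predictions": [[0.97, 0.02, 0.01]]}
```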
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Marc K |
