SageMaker endpoint with TensorFlow container ignoring the inference.py file
I'm using a TensorFlow model which is saved as follows:
tf-model
|--00000123
   |--assets
   |--variables
   |  |--variables.data-00000-of-00001
   |  |--variables.index
   |--keras_metadata.pb
   |--saved_model.pb
The TF model is getting picked up and works as an endpoint: when I associate a SageMaker predictor object with the endpoint and run it, it returns what I expect.
However, I want to invoke it with a JSON POST request and get a JSON response back, the same as with sklearn, XGBoost, or PyTorch endpoints.
I tried to implement this in the inference.py script which I'm passing as an entry point, but no matter what I try, the endpoint just seems to ignore the inference.py script.
I use the script almost exactly as given at the end of this page: https://sagemaker.readthedocs.io/en/stable/frameworks/tensorflow/using_tf.html
I've tried both versions (input/output handler pair and a single handler), I've tried leaving it in the SageMaker environment, I've tried packaging it up inside the tar.gz file, and I've put it in an S3 bucket with pointers and environment variables (via TensorFlowModel kwargs) to it, but no matter what I try, it just ignores the inference.py.
I know it ignores it because I have made minor edits to the application/json branch of the input handler and none of these edits show up; even when I change it so it only accepts text/csv or something else, the changes are not reflected.
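For reference, a minimal handler pair in the style of that documentation page looks roughly like this (the `data` stream, `context.request_content_type`, `context.accept_header`, and the TF Serving `response.content` are supplied by the container; the "instances" wrapping assumes the client posts a bare list of inputs):

```python
import json

def input_handler(data, context):
    """Deserialize the request body into the TF Serving JSON format."""
    if context.request_content_type == "application/json":
        payload = json.loads(data.read().decode("utf-8"))
        # TF Serving's REST API expects {"instances": [...]}
        return json.dumps({"instances": payload})
    raise ValueError(f"Unsupported content type: {context.request_content_type}")

def output_handler(response, context):
    """Pass the TF Serving response body through with the requested content type."""
    return response.content, context.accept_header
```

If edits to the `application/json` branch here never show up in the endpoint's behaviour, the script is not being loaded at all, which points to the artifact layout rather than the handler code.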
What am I doing wrong? How can I get the TensorFlow serverless endpoint to return the output in the POST response instead of saving the output to S3, as per its default behaviour?
Solution 1:[1]
Your untarred model artifacts should look something like this (where inference.py is located in the code/ folder):
model1
|--[model_version_number]
|--variables
|--saved_model.pb
model2
|--[model_version_number]
|--assets
|--variables
|--saved_model.pb
code
|--lib
|--external_module
|--inference.py
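Packaging that layout from Python might look like the sketch below. The placeholder files stand in for a real SavedModel and your own inference.py; the key point is that the model folder and code/ both sit at the root of the archive:

```python
import os
import tarfile

# Throwaway placeholder layout mirroring the tree above; in practice the
# SavedModel files come from model.save(...) and inference.py is your own.
os.makedirs("model1/00000123/variables", exist_ok=True)
open("model1/00000123/saved_model.pb", "wb").close()
os.makedirs("code/lib", exist_ok=True)
open("code/inference.py", "w").close()

# code/ must be at the archive root, next to the model folder(s),
# or the container will not pick up inference.py.
with tarfile.open("model.tar.gz", "w:gz") as tar:
    tar.add("model1")
    tar.add("code")
```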
See this link for more information.
That said, by default application/json is supported for both requests and responses by the SageMaker TensorFlow Serving container.
Thus, you can send JSON in (with the tensor shape your model expects) and receive a JSON response.
This question/answer explains how to make the POST requests using Postman.
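For example, the default JSON request body wraps the input tensor(s) in an "instances" list (the feature vector here is illustrative, not from the asker's model):

```python
import json

# Illustrative request body for a model expecting a length-4 feature vector;
# the TensorFlow Serving REST API wraps inputs in an "instances" list.
request_body = json.dumps({"instances": [[5.1, 3.5, 1.4, 0.2]]})

# The response comes back in the same style, keyed by "predictions",
# e.g. {"predictions": [[0.97, 0.02, 0.01]]}
```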
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Marc K |
