No module named 'fastai' when trying to deploy a fastai model on SageMaker
I have trained and built a fastai (v1) model and exported it as a .pkl file. Now I want to deploy this model for inference on Amazon SageMaker.
I followed the SageMaker documentation for deploying a PyTorch model: https://sagemaker.readthedocs.io/en/stable/frameworks/pytorch/using_pytorch.html#write-an-inference-script
Steps taken
Folder structure
```
Sagemaker/
    export.pkl
    code/
        inference.py
        requirement.txt
```
requirement.txt
```
spacy==2.3.4
torch==1.4.0
torchvision==0.5.0
fastai==1.0.60
numpy
```
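The question doesn't show inference.py, but based on the import visible in the traceback below, a minimal version for this layout might look like the following sketch. The JSON payload format and the 'text' key are assumptions for illustration, not part of the original setup; the SageMaker PyTorch serving container calls the model_fn, input_fn, predict_fn, and output_fn handlers.

```python
import json
from fastai.basic_train import load_learner

def model_fn(model_dir):
    # SageMaker extracts model.tar.gz into model_dir (/opt/ml/model),
    # so export.pkl sits at its root per the folder structure above.
    return load_learner(model_dir, 'export.pkl')

def input_fn(request_body, request_content_type):
    # Assumed JSON payload with a 'text' field; adjust to your model's input.
    if request_content_type == 'application/json':
        return json.loads(request_body)['text']
    raise ValueError('Unsupported content type: {}'.format(request_content_type))

def predict_fn(input_data, model):
    # fastai v1 Learner.predict returns (category, class index, probabilities)
    category, _, probs = model.predict(input_data)
    return {'category': str(category), 'probabilities': probs.tolist()}

def output_fn(prediction, response_content_type):
    # Serialize the prediction back to the client
    return json.dumps(prediction)
```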
Command I used to create the model archive
```
cd Sagemaker/
tar -czvf /tmp/model.tar.gz ./export.pkl ./code
```
This generates a model.tar.gz file, which I uploaded to an S3 bucket.
To deploy it, I used the SageMaker Python SDK:
```python
from sagemaker.pytorch import PyTorchModel

role = "sagemaker-role-arn"
model_path = "s3 key for the model.tar.gz file that I created above"
pytorch_model = PyTorchModel(model_data=model_path, role=role, entry_point='inference.py',
                             framework_version="1.4.0", py_version="py3")
predictor = pytorch_model.deploy(instance_type='ml.c5.large', initial_instance_count=1)
```
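For context, inference is then run through the returned predictor. The payload below is a placeholder; it has to match whatever the script's input_fn expects (here, the JSON handler sketched earlier):

```python
import json

# Placeholder payload; the format must match inference.py's input_fn
payload = json.dumps({'text': 'example input'})
result = predictor.predict(payload, initial_args={'ContentType': 'application/json'})
print(result)
```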
After executing the deployment code above, I can see that the model is created and deployed in SageMaker, but I get an error when running inference:
```
botocore.errorfactory.ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (500) from primary with message "No module named 'fastai'
Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/sagemaker_inference/transformer.py", line 110, in transform
    self.validate_and_initialize(model_dir=model_dir)
  File "/opt/conda/lib/python3.6/site-packages/sagemaker_inference/transformer.py", line 157, in validate_and_initialize
    self._validate_user_module_and_set_functions()
  File "/opt/conda/lib/python3.6/site-packages/sagemaker_inference/transformer.py", line 170, in _validate_user_module_and_set_functions
    user_module = importlib.import_module(user_module_name)
  File "/opt/conda/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/opt/ml/model/code/inference.py", line 2, in <module>
    from fastai.basic_train import load_learner, DatasetType, Path
ModuleNotFoundError: No module named 'fastai'
```
Clearly the fastai module doesn't get installed. What is the cause of this, and what am I doing wrong here?
Solution 1
To troubleshoot such issues, you should check the CloudWatch logs for the endpoint.
Look there first to see whether requirements.txt was found and installed, and whether there were any dependency errors. Note, too, that the container only installs dependencies from a file named requirements.txt (plural), so the requirement.txt in the layout above would not be picked up.
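As a sketch, the same logs can also be pulled programmatically with boto3 (the endpoint name below is a placeholder):

```python
import boto3

logs = boto3.client('logs')
# Each endpoint writes to a log group named /aws/sagemaker/Endpoints/<endpoint-name>
log_group = '/aws/sagemaker/Endpoints/your-endpoint-name'  # placeholder

# Print recent events from all streams in the endpoint's log group
for event in logs.filter_log_events(logGroupName=log_group)['events']:
    print(event['message'])
```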
For packaging the model and your inference scripts, it's recommended to have two files:
- `model.tar.gz`, which contains the model and its files.
- `sourcedir.tar.gz`, which contains your inference code. Use the SageMaker environment variable `SAGEMAKER_SUBMIT_DIRECTORY` to point to its location on S3 (`s3://bucket/prefix/sourcedir.tar.gz`), and `SAGEMAKER_PROGRAM` to point to the file name, `inference.py`.
Note: when you use `source_dir` in `PyTorchModel`, the SDK packages the directory, uploads it to S3, and sets `SAGEMAKER_SUBMIT_DIRECTORY` for you.
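Under that approach, the deployment from the question could be rewritten roughly as the sketch below; `model.tar.gz` then only needs to contain export.pkl, and the bucket path is a placeholder:

```python
from sagemaker.pytorch import PyTorchModel

pytorch_model = PyTorchModel(
    model_data='s3://bucket/prefix/model.tar.gz',  # placeholder; tarball holding only export.pkl
    role='sagemaker-role-arn',
    entry_point='inference.py',
    source_dir='code',  # local dir with inference.py and requirements.txt; the SDK packages and uploads it
    framework_version='1.4.0',
    py_version='py3',
)
predictor = pytorch_model.deploy(instance_type='ml.c5.large', initial_instance_count=1)
```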
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Abdelrahman Maharek |
