Loading fasttext binary model from S3 fails
I am hosting a pretrained fastText model on S3 (uncompressed) and trying to load it in an AWS Lambda function. I am using the gensim.models.fasttext module to load the model:
from gensim.models.fasttext import load_facebook_vectors

def load_model(obj):
    model = load_facebook_vectors(obj["path"])
    return model
where obj["path"] is the S3 path, but I keep getting the following error:
"errorMessage": "fileno"
"errorType": "UnsupportedOperation"
"stackTrace": [
...
" File \"/var/task/gensim/models/fasttext.py\", line 784, in load_facebook_vectors\n full_model = _load_fasttext_format(path, encoding=encoding, full_model=False)\n"
" File \"/var/task/gensim/models/fasttext.py\", line 808, in _load_fasttext_format\n m = gensim.models._fasttext_bin.load(fin, encoding=encoding, full_model=full_model)\n"
" File \"/var/task/gensim/models/_fasttext_bin.py\", line 348, in load\n vectors_ngrams = _load_matrix(fin, new_format=new_format)\n"
" File \"/var/task/gensim/models/_fasttext_bin.py\", line 282, in _load_matrix\n matrix = np.fromfile(fin, _FLOAT_DTYPE, count)\n"
]
Solution 1:[1]
The documentation for load_facebook_vectors says:
This function uses the smart_open library to open the path. The path may be on a remote host (e.g. HTTP, S3, etc).
There are examples of accessing S3 objects in the smart_open documentation. I have not personally tried this, but I wanted to make sure you had eliminated all options before deciding to forcibly download the object and access it locally.
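If streaming the remote path still fails the way the traceback above shows (np.fromfile needs a real file descriptor, which a streamed smart_open object does not expose via fileno), the fallback hinted at here is to download the object to local disk first and load it from there. A minimal sketch, assuming a boto3 client and placeholder bucket/key names (not from the question):

    import os
    import boto3
    from gensim.models.fasttext import load_facebook_vectors

    s3 = boto3.client("s3")

    def load_model(obj):
        # /tmp is the only writable path inside a Lambda function
        local_path = os.path.join("/tmp", os.path.basename(obj["key"]))
        s3.download_file(obj["bucket"], obj["key"], local_path)
        # Loading from local disk gives numpy a real file descriptor,
        # so np.fromfile no longer raises UnsupportedOperation: fileno
        return load_facebook_vectors(local_path)

Note that Lambda's /tmp storage defaults to 512 MB (configurable higher), so check that the uncompressed .bin file actually fits before taking this route.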
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | jarmod |
