Read h5 file using AWS boto3
I am trying to read an h5 file from AWS S3 using boto3.
client = boto3.client('s3',key ='key')
result = client.get_object(Bucket='bucket', Key='file')
with h5py.File(result['Body'], 'r') as f:
    data = f
TypeError: expected str, bytes or os.PathLike object, not StreamingBody
Any idea?
h5py version is 2.10, boto3 version is 1.7.58
The same question was asked here, but it has no answer.
Solution 1:[1]
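h5py.File() expects a path to a local file on disk (h5py 2.9 and later also accept a seekable file-like object). However, you are passing it the StreamingBody returned by get_object, which is a stream of the object's bytes that does not support seeking, hence the TypeError.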
You can download the file with:
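import boto3
import h5py

s3_client = boto3.client('s3')
# Arguments: bucket name, object key, local destination path
s3_client.download_file('bucket', 'key', 'filename')
with h5py.File('filename', 'r') as f:
    # Read what you need here; f is closed when the block exits
    data = f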
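If writing to disk is not desirable, the point above about in-memory data suggests an alternative: read the body fully into an `io.BytesIO` buffer, which is seekable, and hand that to h5py. The following is a minimal sketch, assuming h5py 2.9 or later (which accepts file-like objects) and a file small enough to fit in memory. The helper names `to_seekable` and `list_h5_keys` are hypothetical, not part of boto3 or h5py.

```python
import io


def to_seekable(stream):
    """Read a non-seekable stream fully into memory and return a seekable buffer."""
    return io.BytesIO(stream.read())


def list_h5_keys(bucket, key):
    """Return the top-level names of an HDF5 file stored in S3 (hypothetical helper)."""
    # Imported here so to_seekable() can be used without these packages installed.
    import boto3
    import h5py

    body = boto3.client('s3').get_object(Bucket=bucket, Key=key)['Body']
    # h5py needs to seek within the file, so wrap the StreamingBody first.
    with h5py.File(to_seekable(body), 'r') as f:
        return list(f.keys())


# Example call (placeholder bucket and key):
# print(list_h5_keys('bucket', 'file'))
```

The trade-off is memory: the whole object is held in RAM, so for large files downloading to disk is likely the safer choice.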
Solution 2:[2]
A working solution using tempfile for temporary storage. This reads the model data from your S3 bucket, writes it to a temporary directory, and loads it from there into a variable.
import tempfile
from keras import models
import boto3
# Creating the low level functional client
client = boto3.client(
    's3',
    aws_access_key_id = 'ACCESS_KEY_ID',
    aws_secret_access_key = 'ACCESS_SECRET_KEY',
    region_name = 'us-east-1'
)
# Fetch the S3 object
response_data = client.get_object(
    Bucket = 'bucket-name',
    Key = 'model/model.h5'
)
model_name = 'model.h5'
# Read the whole object into memory as bytes
model_bytes = response_data['Body'].read()
# Save the bytes to temporary storage
with tempfile.TemporaryDirectory() as tempdir:
    model_path = f"{tempdir}/{model_name}"
    with open(model_path, 'wb') as my_data_file:
        my_data_file.write(model_bytes)
    # Load the model only after the file has been closed, so all bytes are flushed to disk
    gotten_model = models.load_model(model_path)
gotten_model.summary()
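A variation on the approach above, for cases where the model is too large to hold in memory comfortably: boto3's `download_fileobj` writes the object into an open file in chunks, so the bytes never need to be read into a single variable. This is a sketch only; the helper name `download_to_tempfile` and its `suffix` parameter are hypothetical, and the client is passed in so the function does not depend on how credentials are configured.

```python
import os
import tempfile


def download_to_tempfile(client, bucket, key, suffix=".h5"):
    """Stream an S3 object into a named temporary file and return its path."""
    # delete=False keeps the file after it is closed; the caller removes it.
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        # boto3 signature: download_fileobj(Bucket, Key, Fileobj, ...)
        client.download_fileobj(bucket, key, tmp)
        return tmp.name


# Example use (placeholder bucket and key, requires boto3 and keras):
# import boto3
# from keras import models
# path = download_to_tempfile(boto3.client('s3'), 'bucket-name', 'model/model.h5')
# try:
#     gotten_model = models.load_model(path)
# finally:
#     os.remove(path)
```

The file is closed before its path is returned, so the loader always sees the complete contents.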
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source | 
|---|---|
| Solution 1 | John Rotenstein | 
| Solution 2 | Samuel Tosan Ayo | 
