Load raster from bytestream and set its CRS

What I want to do: load a raster from an S3 bucket into memory and set its CRS to EPSG:4326 (it has no CRS set).

What I have so far:

from io import BytesIO

import boto3
import rasterio
from rasterio.crs import CRS

bucket = 'my bucket'
key = 'my_key'
s3 = boto3.client('s3')
file_byte_string = s3.get_object(Bucket=bucket, Key=key)['Body'].read()
with rasterio.open(BytesIO(file_byte_string), mode='r+') as ds:
    crs = CRS({"init": "epsg:4326"})
    ds.crs = crs

I took the structure of my code from the question "Set CRS for a file read with rasterio".

It works if I give it a path to a local file, but it does not work for bytestreams.

The error I get when I have 'r+' mode:

rasterio.errors.PathError: invalid path '<_io.BytesIO object at 0x7fb4503ca4d0>'

The error I get when I have 'r' mode:

rasterio.errors.DatasetAttributeError: read-only attribute

Is there a way to load a bytestream in r+ mode so that I can set/modify the CRS?



Solution 1:[1]

You can achieve this by writing your bytes to a tempfile.NamedTemporaryFile and opening that file by name. This and some alternatives are explained in the docs.

import tempfile

import boto3
import rasterio
from rasterio.crs import CRS

bucket = 'asdf'
key = 'asdf'

s3 = boto3.client('s3')
file_byte_string = s3.get_object(Bucket=bucket, Key=key)['Body'].read()

with tempfile.NamedTemporaryFile() as tmpfile:
    tmpfile.write(file_byte_string)
    tmpfile.flush()  # make sure all bytes are on disk before reopening by name
    with rasterio.open(tmpfile.name, "r+") as ds:
        # the CRS is written to the temporary file, which is deleted on exit
        ds.crs = CRS.from_epsg(4326)

An important limitation of this approach is that you have to download the whole file from S3 into memory, rather than accessing the file remotely.
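
If the temporary-file step is needed in more than one place, it could be wrapped in a small helper. The following is only a sketch using the standard library: the name bytes_as_tempfile and the default ".tif" suffix are my own assumptions, not part of rasterio or boto3. It closes the file before handing out the path, which avoids the flushing issue and also lets the file be reopened by name on Windows.

```python
import os
import tempfile
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def bytes_as_tempfile(data: bytes, suffix: str = ".tif") -> Iterator[str]:
    """Write `data` to a temporary file and yield the file's path.

    Hypothetical helper: the file is closed before the path is yielded,
    so another library can reopen it by name, and it is removed on exit.
    """
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)  # closing the handle flushes the bytes to disk
        yield path
    finally:
        os.remove(path)  # clean up even if the caller raised
```

With rasterio this might be used as `with bytes_as_tempfile(file_byte_string) as path, rasterio.open(path, "r+") as ds: ds.crs = CRS.from_epsg(4326)`. Note that the modified file is deleted when the outer block exits, so if you need the updated bytes, read them back from `path` after the dataset is closed but before leaving the block.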

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
