Issue with xarray.open_rasterio() .img file and multiprocessing Pool

I am trying to use multiprocessing Pool.map() to speed up my code. In the function executed by each worker I reference an xarray.DataArray that was opened using xarray.open_rasterio(). However, I receive errors similar to this:

rasterio.errors.RasterioIOError: Read or write failed. /net/home_stu/cfite/data/CDL/2019/2019_30m_cdls.img, band 1: IReadBlock failed at X offset 190, Y offset 115: Unable to open external data file: /net/home_stu/cfite/data/CDL/2019/

I assume this is some issue with the same file being read by one worker while another worker is opening it? I use DataArray.sel() to select small portions of the raster grid to work with, since the entire .img file is way too big to load all at once. I have tried opening the .img file in the main code and just referencing it in my function, and I've tried opening/closing it inside the function passed to Pool.map(), and I receive errors like this either way. Is my file corrupted, or will I simply not be able to work with this file using multiprocessing Pool? I am very new to multiprocessing, so any advice is appreciated. Here is an example of my code:

import pandas as pd
import xarray as xr
import numpy as np
from multiprocessing import Pool

def select_grid(x, y):
    ds = xr.open_rasterio('myrasterfile.img')  # open the large raster lazily with xarray
    grid = ds.sel(x=slice(x, x + 50), y=slice(y, y + 50))
    grid.load()  # pull the selected window into memory before closing the underlying file
    ds.close()
    return grid

def myfunction(row):
    x = row.x
    y = row.y
    mygrid = select_grid(x, y)
    my_calculation = mygrid.sum().item()  # example calculation, but really I am doing multiple calculations
    pd.Series([my_calculation]).to_csv('filename.csv')  # DataArray has no .to_csv(), so go through pandas

if __name__ == '__main__':  # guard so worker processes don't re-run the pool setup
    with Pool(30) as p:
        p.map(myfunction, list_of_df_rows)


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
