How to avoid R raster "Failure during raster IO" when using lapply() and mean() on a list

I have ~150 high-resolution (0.5 cm/pixel) single-image captures from a DJI Phantom. I create floating-point indices from them using R's raster package, and I then need to save the mean of each index, together with its filename, to a CSV.

Everything works until the mean extraction.

#Example image
class      : RasterLayer 
dimensions : 3648, 5472, 19961856  (nrow, ncol, ncell)
resolution : 1, 1  (x, y)
extent     : 0, 5472, 0, 3648  (xmin, xmax, ymin, ymax)
crs        : NA 
source     : ./Temp/DJI_0318_gindex.tif 
names      : DJI_0318_gindex 
values     : 0, 1  (min, max)

> object.size(example)
12584 bytes

extractor_fun <- function(x) { # Calculate the mean and stdev of one raster file
  r <- raster(x) # read the file at path x as a RasterLayer
  # name<-gsub("_ExG.tif", )
  val <- getValues(r) # get raster values
  # Drop NAs, compute mean and stdev, and pair them with the first 8 characters of the filename
  # (c() coerces the numeric values to character here)
  m <- c(substr(basename(x), start = 1, stop = 8), mean(val, na.rm = TRUE), sd(val, na.rm = TRUE))
  return(m)
}

ExG.list <- lapply(ExG, extractor_fun) # Apply the function to the list of images
df1 <- data.frame(do.call(rbind,ExG.list)) #convert list to data frame
colnames(df1) <- c("File","ExG", "ExG_sd") # rename the columns 

This produces the following error:

Error in rgdal::getRasterData(con, offset = offs, region.dim = reg, band = object@data@band) :
Failure during raster IO

The PC has 32GB of RAM and a quad core Xeon. R has 20GB available to it:

rasterOptions(maxmemory = 2e+20) 

I can successfully run this exact same script, using the exact same RAM limit, and the exact same dataset on another computer that has 64GB of RAM and an 8 core Ryzen 1700x.

How can I manage memory to deploy this to lower powered machines? Could I parallelize the process to extract images on each core while appending to the same list?



Solution 1:[1]

The error you are getting is not related to RAM or computing power. It occurs when a file is read:

Error in rgdal::getRasterData(con, offset = offs, region.dim = reg, band = object@data@band) :
Failure during raster IO

This means that there is a corrupted file. To find out which file(s), use a loop instead of lapply, so that you can see at which file the read fails.
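
A minimal sketch of such a loop, not a definitive implementation. The names `find_unreadable` and `read_fun` are hypothetical and introduced here only for illustration, and `ExG` is assumed to be a character vector of file paths, as the question's code suggests. Each file is read inside `tryCatch()` so that one failure does not stop the run, and the paths that fail are collected.

```r
# Hypothetical helper: try to read every file and return the paths that fail.
# read_fun is any function that takes a path and forces a full read of the file.
find_unreadable <- function(files, read_fun) {
  bad <- character(0)
  for (f in files) {
    ok <- tryCatch({
      read_fun(f)   # an error here means the file could not be read
      TRUE
    }, error = function(e) {
      message(f, ": ", conditionMessage(e))  # report the file and the error
      FALSE
    })
    if (!ok) bad <- c(bad, f)
  }
  bad
}

# Stand-in reader for demonstration only: fails on one made-up file name.
fake_reader <- function(f) {
  if (f == "b.tif") stop("Failure during raster IO")
  invisible(NULL)
}
find_unreadable(c("a.tif", "b.tif", "c.tif"), fake_reader)  # returns "b.tif"

# With the raster package from the question (assumes ExG holds file paths):
# bad <- find_unreadable(ExG, function(f) raster::getValues(raster::raster(f)))
```

Reading the values with `getValues()` matters here, because `raster()` alone only opens the file and reads its header; the IO error in the question is raised when the cell values are actually read.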

And here is an alternative approach to achieve your goals:

library(terra)
x <- rast(ExG) # one layer per file
m <- global(x, "mean", na.rm=TRUE) # data.frame with one row per layer
s <- global(x, "sd", na.rm=TRUE)
d <- data.frame(name=substr(basename(ExG),1,8), mean=m[, 1], std=s[, 1])

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1