OSError: image file is truncated - despite image file not being corrupt

I'm trying to process around 100,000 images in order to feed them into a CNN, but I'm running into some errors. Is there a way to skip errors in a for loop using a try/except block? I don't want the program to stop when it hits a corrupt image.

This is my code, and it works fine when the images are good:

importedImages = []

for filename in files:
    original = load_img(filename, target_size=(imgs_model_width, imgs_model_height))
    numpy_image = img_to_array(original)
    image_batch = np.expand_dims(numpy_image, axis=0)
    importedImages.append(image_batch)

The files list contains the filenames (as strings) of all the images in the current directory. In short, I loop through all the images, convert each one to a numpy array, add a batch dimension, and store the result in the importedImages list. When I reach image 6337 I get the error OSError: image file is truncated. Reading online, this error seems to occur when the image file is "corrupt", whatever that means. When I open that particular image in a viewer, it displays fine.

Is there a simple fix for this or do I manually need to remove the corrupt images every time I hit an error?



Solution 1:[1]

I think I found a solution for skipping the errors at least:

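import numpy as np

# load_img and img_to_array are the same helpers used in the question
# (presumably Keras's image utilities; if so, note that Keras documents
# target_size as (height, width)).
importedImages = []
badImages = []  # filenames that raised an error while loading

for filename in files:
    try:
        original = load_img(filename, target_size=(imgs_model_width, imgs_model_height))
        numpy_image = img_to_array(original)
        image_batch = np.expand_dims(numpy_image, axis=0)
        importedImages.append(image_batch)
    except OSError:
        # Record the failing file; the loop then moves on to the next one.
        badImages.append(filename)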
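
The same skip-and-record pattern can be factored into a small helper that also keeps the error message, which makes it easier to see afterwards why each file was rejected. This is only a sketch: the function name load_all_skipping_errors and the loader argument are hypothetical, and loader stands in for whatever function actually reads one image (for example, a wrapper around the load_img / img_to_array calls above).

```python
def load_all_skipping_errors(paths, loader):
    """Apply loader to every path, skipping the ones that raise OSError.

    loader is a hypothetical stand-in for the real image-loading function.
    Returns (loaded, failed), where failed is a list of (path, message) pairs.
    """
    loaded = []
    failed = []
    for path in paths:
        try:
            loaded.append(loader(path))
        except OSError as exc:
            # Keep the path and the message so bad files can be reviewed later.
            failed.append((path, str(exc)))
    return loaded, failed
```

Catching only OSError, as in the answer, means unrelated bugs (a typo in a variable name, for instance) still stop the program instead of being silently counted as bad images.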

Note: Answer provided by the OP in the question section.
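
On the question of why an image can open fine in a viewer and still be reported as truncated: a file whose data stops early can often still be partially displayed, while a stricter decoder refuses it. A rough way to spot such files up front, assuming the images are JPEGs (the question does not say which format they are), is to check whether the file ends with the JPEG end-of-image marker. The function name looks_truncated_jpeg is hypothetical, and the check is only a heuristic: some valid JPEGs carry extra bytes after the marker, and a damaged file could end with the marker bytes by coincidence.

```python
import os

JPEG_SOI = b"\xff\xd8"  # start-of-image marker, first two bytes of a JPEG
JPEG_EOI = b"\xff\xd9"  # end-of-image marker, normally the last two bytes


def looks_truncated_jpeg(path):
    """Heuristic check: True if a JPEG file does not end with the EOI marker."""
    with open(path, "rb") as fh:
        if fh.read(2) != JPEG_SOI:
            raise ValueError(f"{path} does not start with a JPEG SOI marker")
        # Jump to the last two bytes of the file and compare them with EOI.
        fh.seek(-2, os.SEEK_END)
        return fh.read(2) != JPEG_EOI
```

Running this over the files list before loading would give a list of suspect images to inspect or re-download, rather than discovering them one at a time during the main loop.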

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1