Setting different input and output sizes for each point in dataset when using `ImageDataGenerator()`

I am building a CSRNet largely based on the code in this GitHub repository. Note that this is a fully convolutional network (no dense layers), so it should be able to accept images of arbitrary size.

I want to use an ImageDataGenerator() for data augmentation. CSRNet reduces the spatial size of its input by a factor of 8, so I have produced density maps at 1/8th the size of the images. The following code prints the shapes of two sample images and their respective density maps:

import glob
import os

import numpy as np
from PIL import Image

path_to_images = 'Images/grapes'
path_to_densities = 'Densities/grapes'

# Print the shape of every RGB image
for f in glob.glob(os.path.join(path_to_images, '*.png')):
    im = np.asarray(Image.open(f).convert('RGB'))
    print(im.shape)

# Print the shape of every (grayscale) density map
for f in glob.glob(os.path.join(path_to_densities, '*.bmp')):
    im = np.asarray(Image.open(f))
    print(im.shape)

The output is as follows:

(1365, 2048, 3)
(1536, 2048, 3)
(192, 256)
(170, 256)
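As a quick sanity check, these shapes are consistent with the 1/8 scaling if the density-map dimensions were produced by floor division (an assumption on my part, since the map-generation code isn't shown here):

```python
# Each density-map dimension equals the image dimension floor-divided by 8
pairs = [((1365, 2048), (170, 256)),
         ((1536, 2048), (192, 256))]
for (h, w), (dh, dw) in pairs:
    assert (dh, dw) == (h // 8, w // 8)
```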

After this, I initialize two coupled generators to feed the network:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

path_to_images = os.path.dirname(path_to_images)
path_to_densities = os.path.dirname(path_to_densities)
# Create augmentation training generators for the input images and the label (density) images
image_gen = ImageDataGenerator(rotation_range=20, width_shift_range=0.2, height_shift_range=0.2,
                               brightness_range=(0.5, 1.5), fill_mode="reflect",
                               horizontal_flip=True, zoom_range=0.3)
density_gen = ImageDataGenerator(rotation_range=20, width_shift_range=0.2, height_shift_range=0.2,
                                 brightness_range=(0.5, 1.5), fill_mode="reflect",
                                 horizontal_flip=True, zoom_range=0.3)
# batch_size and seed are defined elsewhere; the shared seed keeps the two streams in sync
image_generator = image_gen.flow_from_directory(path_to_images, batch_size=batch_size,
                                                class_mode=None, seed=seed)
density_generator = density_gen.flow_from_directory(path_to_densities, batch_size=batch_size,
                                                    class_mode=None, seed=seed,
                                                    color_mode="grayscale")
# Combine the image and label generators
generator = zip(image_generator, density_generator)

im1,im2 = next(generator)

print(f"im1 shape = {im1.shape}")
print(f"im2 shape = {im2.shape}")

The output is:

Found 2 images belonging to 2 classes.
Found 2 images belonging to 2 classes.
im1 shape = (2, 256, 256, 3)
im2 shape = (2, 256, 256, 1)

I was expecting the original image shapes shown earlier, but flow_from_directory() resizes every image to its default target_size of 256x256, as the output above confirms.

I have tried the target_size parameter of flow_from_directory(), but that fixes the generator's output size, and since my input images have variable sizes, I don't know how to overcome this.

How can I use ImageDataGenerator() to apply matching augmentations to images and density maps that differ in size, both from one sample to the next and between an image and its corresponding map?
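One possible workaround (a sketch, not a verified solution): bypass flow_from_directory() and drive the augmentation directly with ImageDataGenerator.get_random_transform() and apply_transform(), which operate on single arrays of any size. Passing the same seed to both calls makes the underlying random draws identical, so fractional shifts and zooms scale with each array's own dimensions; the brightness shift is dropped for the density map, since it is a label rather than a photograph. paired_generator and its arguments are names I've made up for illustration:

```python
import numpy as np
from PIL import Image
from tensorflow.keras.preprocessing.image import ImageDataGenerator

def paired_generator(image_paths, density_paths, datagen):
    """Yield (image_batch, density_batch) with the same random geometric
    transform applied to both, at each sample's native resolution."""
    rng = np.random.RandomState()
    while True:
        for img_path, den_path in zip(image_paths, density_paths):
            img = np.asarray(Image.open(img_path).convert('RGB'), dtype=np.float32)
            den = np.asarray(Image.open(den_path), dtype=np.float32)[..., np.newaxis]
            seed = rng.randint(1 << 30)
            # Same seed -> same fractional draws, so shifts and zooms scale
            # consistently with each array's own height and width.
            img_params = datagen.get_random_transform(img.shape, seed=seed)
            den_params = datagen.get_random_transform(den.shape, seed=seed)
            den_params['brightness'] = None  # photometric change only on the image
            # Add a leading batch axis of 1, since sample sizes vary
            yield (datagen.apply_transform(img, img_params)[np.newaxis],
                   datagen.apply_transform(den, den_params)[np.newaxis])
```

With the batch size effectively fixed at 1, the variable sizes never have to be padded into a common batch shape; if larger batches were needed, random fixed-size crops of each image (with the corresponding 1/8-scale crop of its map) would be the usual alternative.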



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
