ValueError: Input 0 of layer Conv1_pad is incompatible with the layer: expected ndim=4, found ndim=3. Full shape received: [224, 224, 3]

I am quite new to Python and machine learning, so bear with me.

I am following a tutorial on training a neural network classifier on ImageNet using TensorFlow 2 (specifically this one: https://medium.com/analytics-vidhya/how-to-train-a-neural-network-classifier-on-imagenet-using-tensorflow-2-ede0ea3a35ff), and I encounter the error shown in the title. I tried many of the fixes suggested in other Stack Overflow threads about this issue, including changing the input shape and reshaping the input, but none of them worked. Maybe I applied them incorrectly or didn't understand the code well enough.

I made a lot of changes, but the error message at the last step of the output was always the same. I hope someone can tell me what I am doing wrong.

My setup:

- TensorFlow 2.3.1

- ILSVRC2012_img_train.tar and ILSVRC2012_img_val.tar from ImageNet

Full code:

import tensorflow as tf
import tensorflow_datasets as tfds
import os
import numpy as np
import matplotlib.pyplot as plt

labels_path = tf.keras.utils.get_file('ImageNetLabels.txt','https://storage.googleapis.com/download.tensorflow.org/data/ImageNetLabels.txt')
imagenet_labels = np.array(open(labels_path).read().splitlines())

data_dir = 'datasets/imagenet/'
write_dir = 'psando/tf-imagenet-dirs'

download_config = tfds.download.DownloadConfig(
                      extract_dir=os.path.join(write_dir, 'extracted'),
                      manual_dir=data_dir
                  )
download_and_prepare_kwargs = {
    'download_dir': os.path.join(write_dir, 'downloaded'),
    'download_config': download_config,
}
ds = tfds.load('imagenet2012_subset', 
               data_dir=os.path.join(write_dir, 'data'),         
               split='train', 
               shuffle_files=False, 
               download=True, 
               as_supervised=True,
               download_and_prepare_kwargs=download_and_prepare_kwargs)

def resize_with_crop(image, label):
    i = image
    i = tf.cast(i, tf.float32)
    i = tf.image.resize_with_crop_or_pad(i, 224, 224)
    i = tf.keras.applications.mobilenet_v2.preprocess_input(i)
    return (i, label)

ds = ds.map(resize_with_crop)
    
pretrained_model = tf.keras.applications.MobileNetV2(include_top=True, 
                                                     weights='imagenet')
pretrained_model.trainable = False
pretrained_model.compile(optimizer='adam', 
                         loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True), 
                         metrics=['accuracy'])
decode_predictions = tf.keras.applications.mobilenet_v2.decode_predictions

result = pretrained_model.evaluate(ds)
print(dict(zip(pretrained_model.metrics_names, result)))
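
A likely cause, judging from the shape in the error message: the dataset is never batched, so each element passed to the model is a single image of shape [224, 224, 3] (ndim=3), while the Keras model expects a leading batch dimension (ndim=4). The sketch below illustrates this with synthetic data rather than ImageNet; the blank images, the labels, and the batch size of 4 are assumptions made only for the illustration.

```python
import tensorflow as tf

# Synthetic stand-in for the ImageNet pipeline (assumption: 8 blank images).
images = tf.zeros([8, 224, 224, 3], dtype=tf.float32)
labels = tf.zeros([8], dtype=tf.int64)
ds = tf.data.Dataset.from_tensor_slices((images, labels))

# Without batching, each element is one image with three dimensions.
unbatched_shape = ds.element_spec[0].shape
print("unbatched:", unbatched_shape, "rank", unbatched_shape.rank)

# batch() adds the leading batch dimension, giving four dimensions.
batched = ds.batch(4)  # 4 is an arbitrary batch size chosen for this sketch
batched_shape = batched.element_spec[0].shape
print("batched:", batched_shape, "rank", batched_shape.rank)
```

In the code from the question, the corresponding change would presumably be ds = ds.map(resize_with_crop).batch(32), where 32 is an arbitrary batch size picked here for illustration.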


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
