About input_shape in keras.layers from TensorFlow
I am a beginner with TensorFlow. I have just tried to fit a simple LeNet-5 on the MNIST data.
My training and test data start out as NumPy arrays, i.e., of shape (60000, 28, 28). I then set up my model as below.
```python
from tensorflow.keras import Sequential, layers

model_LeNet5 = Sequential([
    layers.Conv2D(6, kernel_size=3, strides=1, input_shape=(28, 28, 1)),
    layers.MaxPooling2D(pool_size=2, strides=2),
    layers.ReLU(),
    layers.Conv2D(16, kernel_size=3, strides=1),
    layers.MaxPooling2D(pool_size=2, strides=2),
    layers.ReLU(),
    layers.Flatten(),
    layers.Dense(120, activation='relu'),
    layers.Dense(84, activation='relu'),
    layers.Dense(10)
])
```
I can understand why it succeeds when I set input_shape to (28, 28) or train_images.shape[1:], but I cannot understand why input_shape=(28, 28, 1) also works (as shown in the code above).
There seems to be an inconsistency between the shape of the data and the declared input size (i.e., [60000, 28, 28] vs. [28, 28, 1]), and the broadcasting rules would not link [60000, 28, 28] with [28, 28, 1] either. Thanks to anyone who can explain the mechanism of input_shape.
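To make the mismatch concrete, here is a small sketch, assuming the data comes from the standard keras.datasets.mnist loader (variable names are illustrative), that simply prints the shapes involved:

```python
import numpy as np
from tensorflow.keras.datasets import mnist

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

print(train_images.shape)        # (60000, 28, 28) - the whole training set
print(train_images.shape[1:])    # (28, 28)        - one sample, with no channel axis
print(train_images[0].ndim)      # 2               - a single image is a 2D array
```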
Solution 1 [1]
A single grayscale image can be represented using a two-dimensional (2D) NumPy array or a tensor. Since there is only one channel in a grayscale image, we don’t need an extra dimension to represent the color channel. The two dimensions represent the height and width of the image.
A batch of 3 grayscale images can be represented using a three-dimensional (3D) NumPy array or a tensor. Here, we need an extra dimension to represent the number of images.
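As a minimal NumPy sketch of that distinction (the array names and the batch size of 3 are illustrative): input_shape in Keras describes a single sample without the batch dimension, and an explicit channel axis can be added so that each sample becomes (28, 28, 1).

```python
import numpy as np

single_image = np.zeros((28, 28))      # one grayscale image: 2D (height, width)
batch_of_3   = np.zeros((3, 28, 28))   # batch of 3 grayscale images: 3D

# Adding an explicit channel axis turns each (28, 28) sample into (28, 28, 1),
# so the whole batch becomes (3, 28, 28, 1).
batch_with_channel = np.expand_dims(batch_of_3, axis=-1)
print(batch_with_channel.shape)        # (3, 28, 28, 1)
```

A 4D array shaped this way lines up sample-by-sample with the input_shape=(28, 28, 1) declared in the model above.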
For more information, check out this article on towardsdatascience.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Author |
|---|---|
| Solution 1 | Amirhossein Rezaei |

