How to define input_shape in a Keras model properly
I'm new to TensorFlow.
The following is the dataset I am working on:
```python
import numpy as np
import pandas as pd
import tensorflow as tf

abalone_train = pd.read_csv(
    "https://storage.googleapis.com/download.tensorflow.org/data/abalone_train.csv",
    names=["Length", "Diameter", "Height", "Whole weight", "Shucked weight",
           "Viscera weight", "Shell weight", "Age"])
abalone_train.head()

abalone_cols = abalone_train.columns
y_train = abalone_train[abalone_cols[-1]]
x_train = abalone_train[abalone_cols[:-1]]
```
I tried 2 iterations of the model.
1st iteration:
```python
model = tf.keras.models.Sequential([
    tf.keras.layers.InputLayer(input_shape=(None, 7)),
    tf.keras.layers.Dense(20, activation='relu'),
    tf.keras.layers.Dense(10, activation='relu'),
    tf.keras.layers.Dense(2, activation='relu'),
    tf.keras.layers.Dense(1, activation='relu'),
])
model.compile(optimizer='sgd', loss='mean_squared_error')

x_train_np = np.array(x_train)
y_train_np = np.array(y_train)
modelcheck = model.fit(x_train_np, y_train_np, epochs=5)
2nd iteration:
Similar to the 1st one, but I only changed the input_shape:
```python
model = tf.keras.models.Sequential([
    tf.keras.layers.InputLayer(input_shape=(7,)),
    tf.keras.layers.Dense(20, activation='relu'),
    tf.keras.layers.Dense(10, activation='relu'),
    tf.keras.layers.Dense(2, activation='relu'),
    tf.keras.layers.Dense(1, activation='relu'),
])
model.compile(optimizer='sgd', loss='mean_squared_error')

x_train_np = np.array(x_train)
y_train_np = np.array(y_train)
modelcheck = model.fit(x_train_np, y_train_np, epochs=5)
```
In the first iteration, I get a constant loss of 108.2235 across every step and epoch:

```
Epoch 1/5
104/104 [==============================] - 1s 4ms/step - loss: 108.2235
Epoch 2/5
104/104 [==============================] - 0s 4ms/step - loss: 108.2235
Epoch 3/5
104/104 [==============================] - 0s 4ms/step - loss: 108.2235
Epoch 4/5
104/104 [==============================] - 0s 4ms/step - loss: 108.2235
Epoch 5/5
104/104 [==============================] - 0s 4ms/step - loss: 108.2235
```
In the 2nd one, the code is working fine and I am getting a decreasing loss as follows:

```
Epoch 1/5
104/104 [==============================] - 1s 5ms/step - loss: 13.9729
Epoch 2/5
104/104 [==============================] - 0s 4ms/step - loss: 8.0497
Epoch 3/5
104/104 [==============================] - 0s 4ms/step - loss: 7.4067
Epoch 4/5
104/104 [==============================] - 0s 4ms/step - loss: 6.9215
Epoch 5/5
104/104 [==============================] - 0s 5ms/step - loss: 6.5436
```
I don't understand how Keras is treating these two iterations differently. From what I have read, even if I put `None` at the beginning, it should not matter, as that dimension is just the batch size.
Am I missing something here? Any guidance would be really helpful!
Solution 1:[1]
In the input layer you don't define the batch size. You define only the shape of a single sample, excluding the batch dimension; Keras automatically prepends a `None` to the shape of each layer, which is later replaced by the actual batch size.
So in the 1st iteration your input shape is incorrect: `input_shape=(None, 7)` declares a 3-D input of shape `(batch, None, 7)`, not a batch of rows. Your data is made up of rows of 7 elements, so each sample is a vector of shape `(7,)`, and the correct input shape is `(7,)`.
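A quick way to see the difference is to inspect the shapes of the built models. The sketch below (an illustration added here, not from the original answer; it assumes `tensorflow` is installed) builds a minimal model for each input specification and prints the resulting output shape. Dense layers act only on the last axis, so the `(None, 7)` spec yields a model that expects 3-D input:

```python
import tensorflow as tf

# input_shape=(None, 7): Keras prepends the batch axis, so this model
# expects 3-D input of shape (batch, None, 7).
model_3d = tf.keras.models.Sequential([
    tf.keras.layers.InputLayer(input_shape=(None, 7)),
    tf.keras.layers.Dense(1),
])

# input_shape=(7,): this model expects 2-D input of shape (batch, 7),
# which matches rows of 7 feature columns.
model_2d = tf.keras.models.Sequential([
    tf.keras.layers.InputLayer(input_shape=(7,)),
    tf.keras.layers.Dense(1),
])

print(model_3d.output_shape)  # (None, None, 1)
print(model_2d.output_shape)  # (None, 1)
```

The extra `None` axis in the first model is why it does not line up with 2-D training data of shape `(num_rows, 7)`.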
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
