Keras incompatible layer error in Embedding
I am training a model to get the embeddings of texts using Keras, but I am running into a dimension-incompatibility issue. I have tried the things suggested in different posts but haven't been able to resolve the error. Below is my code:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Flatten, Dense

text_embedding_size = 30
text_tknzr_vocab_size = 700
text_tknzr_max_length = 557
text_embedder_model = Sequential()
text_embedder_model.add(Embedding(input_dim=700, output_dim=30, input_length=557, name="text_embedding"))
text_embedder_model.add(Flatten())
text_embedder_model.add(Dense(15, activation="relu"))
text_embedder_model.add(Dense(2, activation='sigmoid'))
text_embedder_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=[tf.keras.metrics.KLDivergence()])
print(text_embedder_model.summary())
The model summary is
Model: "sequential_8"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
text_embedding (Embedding) (None, 557, 30) 21000
_________________________________________________________________
flatten_7 (Flatten) (None, 16710) 0
_________________________________________________________________
dense_11 (Dense) (None, 15) 250665
_________________________________________________________________
dense_12 (Dense) (None, 3) 48
=================================================================
Total params: 271,713
Trainable params: 271,713
Non-trainable params: 0
On running the line below...
text_embedder_model.fit(x = tr_padded_docs, y=dummy_y_tr, epochs = 10, batch_size = 512)
I receive the following error:
ValueError: Input 0 of layer dense_11 is incompatible with the layer: expected axis -1 of input shape to have value 16710 but received input with shape [None, 21000]
I have looked up a few posts and tutorials but can't locate the error.
Solution 1:[1]
I have replicated the same model with the imdb_reviews dataset. You can set the sentence length while padding as shown below:
vocab_size = 700
embedding_dim = 30
max_length = 557
trunc_type='post'
oov_tok = "<OOV>"
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
tokenizer = Tokenizer(num_words = vocab_size, oov_token=oov_tok)
tokenizer.fit_on_texts(training_sentences)
sequences = tokenizer.texts_to_sequences(training_sentences)
# This will change sentence length as defined while padding
padded = pad_sequences(sequences,maxlen=max_length, truncating=trunc_type)
testing_sequences = tokenizer.texts_to_sequences(testing_sentences)
testing_padded = pad_sequences(testing_sequences,maxlen=max_length)
padded.shape, testing_padded.shape
Output:
((25000, 557), (25000, 557))
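Note that the original error (expected 16710, received 21000) indicates that tr_padded_docs were padded to a length of 700 rather than 557, since 700 * 30 = 21000 while 557 * 30 = 16710. As a minimal sketch (assuming tr_texts holds your raw training strings and text_tknzr is your fitted Tokenizer; both names are placeholders for your own data), padding with the same max_length the Embedding layer was built with removes the mismatch:
# Sketch: pad your own data to the length the model expects (557).
# `tr_texts` and `text_tknzr` are assumed names for your raw texts and fitted Tokenizer.
tr_sequences = text_tknzr.texts_to_sequences(tr_texts)
tr_padded_docs = pad_sequences(tr_sequences, maxlen=max_length, truncating=trunc_type)
print(tr_padded_docs.shape)  # (num_samples, 557) -> Flatten outputs 557 * 30 = 16710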
Also, change the final Dense layer to a single unit (units=1), since you are using the sigmoid activation, which is meant for binary classification, and change the loss function to 'binary_crossentropy':
text_embedder_model.add(Dense(1, activation='sigmoid'))
text_embedder_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=[tf.keras.metrics.KLDivergence()])
print(text_embedder_model.summary())
Output:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
text_embedding (Embedding) (None, 557, 30) 21000
flatten (Flatten) (None, 16710) 0
dense (Dense) (None, 15) 250665
dense_1 (Dense) (None, 1) 16
=================================================================
Total params: 271,681
Trainable params: 271,681
Non-trainable params: 0
_________________________________________________________________
None
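As an aside, if you would rather keep the one-hot encoded labels (dummy_y_tr) from the question, a sketch of the alternative is to keep two output units but use softmax with categorical_crossentropy:
# Alternative sketch: two output units with softmax for one-hot labels such as dummy_y_tr.
text_embedder_model.add(Dense(2, activation='softmax'))
text_embedder_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=[tf.keras.metrics.KLDivergence()])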
Now you can train the model using the padded dataset:
text_embedder_model.fit(padded, training_labels_final, epochs=10, batch_size=516, validation_data=(testing_padded, testing_labels_final))
Output:
Epoch 1/10
49/49 [==============================] - 5s 27ms/step - loss: 0.6748 - kullback_leibler_divergence: 0.3321 - val_loss: 0.6284 - val_kullback_leibler_divergence: 0.2766
Epoch 2/10
49/49 [==============================] - 1s 20ms/step - loss: 0.5010 - kullback_leibler_divergence: 0.2490 - val_loss: 0.4044 - val_kullback_leibler_divergence: 0.1922
Epoch 3/10
49/49 [==============================] - 1s 20ms/step - loss: 0.3582 - kullback_leibler_divergence: 0.1780 - val_loss: 0.3638 - val_kullback_leibler_divergence: 0.2079
Epoch 4/10
49/49 [==============================] - 1s 17ms/step - loss: 0.3142 - kullback_leibler_divergence: 0.1572 - val_loss: 0.3604 - val_kullback_leibler_divergence: 0.1282
Epoch 5/10
49/49 [==============================] - 1s 18ms/step - loss: 0.2786 - kullback_leibler_divergence: 0.1410 - val_loss: 0.3489 - val_kullback_leibler_divergence: 0.1564
Epoch 6/10
49/49 [==============================] - 1s 18ms/step - loss: 0.2445 - kullback_leibler_divergence: 0.1225 - val_loss: 0.3518 - val_kullback_leibler_divergence: 0.1744
Epoch 7/10
49/49 [==============================] - 1s 19ms/step - loss: 0.2131 - kullback_leibler_divergence: 0.1069 - val_loss: 0.3653 - val_kullback_leibler_divergence: 0.1625
Epoch 8/10
49/49 [==============================] - 1s 18ms/step - loss: 0.1841 - kullback_leibler_divergence: 0.0924 - val_loss: 0.3883 - val_kullback_leibler_divergence: 0.1442
Epoch 9/10
49/49 [==============================] - 1s 19ms/step - loss: 0.1581 - kullback_leibler_divergence: 0.0792 - val_loss: 0.3979 - val_kullback_leibler_divergence: 0.1878
Epoch 10/10
49/49 [==============================] - 1s 15ms/step - loss: 0.1307 - kullback_leibler_divergence: 0.0657 - val_loss: 0.4297 - val_kullback_leibler_divergence: 0.1623
<keras.callbacks.History at 0x7f11f0149b10>
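Since the stated goal is to get embeddings of the texts, once training finishes the learned vectors can be read back from the layer named "text_embedding". A minimal sketch, assuming the trained model from above:
# The embedding matrix itself: one 30-dim vector per vocabulary index.
embedding_matrix = text_embedder_model.get_layer("text_embedding").get_weights()[0]
print(embedding_matrix.shape)  # (700, 30)
# Per-text token embeddings via a sub-model that stops at the Embedding layer.
embedding_extractor = tf.keras.Model(inputs=text_embedder_model.input,
                                     outputs=text_embedder_model.get_layer("text_embedding").output)
text_token_embeddings = embedding_extractor.predict(padded)  # (25000, 557, 30)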
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | TFer2 |
