TensorFlow -- Invalid argument: assertion failed: [Condition x == y did not hold element-wise:]
I am trying to create an encoder-decoder RNN that takes sequence_lengths as an additional model input, so that the model ignores the padding (essentially masking). The problem is that when I do this, I get a rather strange error message that I can't make sense of.
The code mostly follows the examples given in the TensorFlow Addons documentation on BasicDecoder:
https://www.tensorflow.org/addons/api_docs/python/tfa/seq2seq/BasicDecoder
and cell 67 of this notebook from the handson-ml2 GitHub repository:
https://github.com/ageron/handson-ml2/blob/master/16_nlp_with_rnns_and_attention.ipynb
Error Message
InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: assertion failed: [Condition x == y did not hold element-wise:] [x (sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/Shape_1:0) = ] [32 7] [y (sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/strided_slice:0) = ] [32 4]
[[node sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/assert_equal_1/Assert/Assert (defined at var/folders/py/f2mvp4b141l29q795tkjtxr40000gn/T/ipykernel_22910/309539710.py:1) ]]
[[Func/gradient_tape/model_11/basic_decoder_11/decoder/while/model_11/basic_decoder_11/decoder/while_grad/body/_225/input/_684/_200]]
(1) Invalid argument: assertion failed: [Condition x == y did not hold element-wise:] [x (sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/Shape_1:0) = ] [32 7] [y (sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/strided_slice:0) = ] [32 4]
[[node sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/assert_equal_1/Assert/Assert (defined at var/folders/py/f2mvp4b141l29q795tkjtxr40000gn/T/ipykernel_22910/309539710.py:1) ]]
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_55181]
Function call stack:
train_function -> train_function
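As far as I can tell, the assertion is comparing two shapes: [32 7], which matches a batch of y_train (batch size 32, targets padded to 7 time steps), and [32 4], which appears to be the leading dimensions of the decoder output for that batch. A minimal sketch of the same kind of mismatch (the vocabulary size of 100 is made up):

import tensorflow as tf

# Shapes mirroring the assertion: 32 = batch size, 7 = padded target length,
# 4 = number of decoder steps, 100 = assumed vocabulary size.
labels = tf.zeros([32, 7], dtype=tf.int32)   # e.g. one batch of y_train
logits = tf.random.uniform([32, 4, 100])     # decoder output with only 4 steps

# sparse_categorical_crossentropy needs the label and prediction tensors to
# agree on every dimension except the class dimension, so this fails with a
# shape-mismatch error (the wording differs between eager and graph execution).
loss = tf.keras.losses.sparse_categorical_crossentropy(
    labels, logits, from_logits=True)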
Model
import numpy as np
import tensorflow as tf
from tensorflow import keras
import tensorflow_addons as tfa

# input_vocab_size, embedding_size and target_vocab_size are defined earlier.
encoder_inputs = keras.layers.Input(shape=[None], dtype=np.int32)
decoder_inputs = keras.layers.Input(shape=[None], dtype=np.int32)
sequence_lengths = keras.layers.Input(shape=[], dtype=np.int32)

# Shared embedding layer for encoder and decoder tokens.
embeddings = keras.layers.Embedding(input_vocab_size, embedding_size)
encoder_embeddings = embeddings(encoder_inputs)
decoder_embeddings = embeddings(decoder_inputs)

# Encoder; its final state initialises the decoder.
encoder = keras.layers.LSTM(512, return_state=True)
encoder_outputs, state_h, state_c = encoder(encoder_embeddings)
encoder_state = [state_h, state_c]

# Decoder: a BasicDecoder fed with the (teacher-forced) decoder inputs.
sampler = tfa.seq2seq.sampler.TrainingSampler()
decoder_cell = keras.layers.LSTMCell(512)
output_layer = keras.layers.Dense(target_vocab_size)
decoder = tfa.seq2seq.basic_decoder.BasicDecoder(decoder_cell, sampler,
                                                 output_layer=output_layer)
final_outputs, final_state, final_sequence_lengths = decoder(
    decoder_embeddings, initial_state=encoder_state,
    sequence_length=sequence_lengths)
Y_proba = tf.nn.softmax(final_outputs.rnn_output)

model = keras.models.Model(
    inputs=[encoder_inputs, decoder_inputs, sequence_lengths],
    outputs=[Y_proba])
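The compile and fit steps are not shown above; a minimal sketch of them, with the loss taken from the error trace and the optimizer and epoch count assumed, looks like this:

model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")  # optimizer assumed
# batch_size=32 (the Keras default) matches the batch dimension of 32 in the
# shapes reported by the assertion; the failure occurs inside this call.
history = model.fit([X_train, X_decoder, seq_length], y_train,
                    epochs=10, batch_size=32)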
Variables and other information
X_train.shape = (24575, 35)
y_train.shape = (24575, 7)
X_decoder.shape = (24575, 7)
seq_length.shape = (24575,)
X_train and y_train are tokenised and padded arrays.
X_decoder is y_train shifted along by one step, i.e. X_decoder = np.c_[np.zeros((y_train.shape[0], 1)), y_train[:, :-1]]
seq_length is an array containing, for each row of y_train, the index of the eos token in that row.
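For reference, a minimal sketch of how seq_length can be built as described (the eos token id below is a placeholder that depends on the tokenizer):

import numpy as np

eos_token_id = 2  # placeholder; the real id depends on the tokenizer
# Index of the first eos token in each padded row of y_train -> shape (24575,)
seq_length = np.argmax(y_train == eos_token_id, axis=1).astype(np.int32)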
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow