TensorFlow -- Invalid argument: assertion failed: [Condition x == y did not hold element-wise:]

I am trying to build an encoder-decoder RNN that takes sequence_lengths as an additional input, so that the model ignores padding (essentially masking). When I pass the sequence lengths in, training fails with an error message that I can't make sense of.

The code mostly follows the BasicDecoder example in the TensorFlow Addons documentation:

https://www.tensorflow.org/addons/api_docs/python/tfa/seq2seq/BasicDecoder

and cell 67 of this notebook on GitHub:

https://github.com/ageron/handson-ml2/blob/master/16_nlp_with_rnns_and_attention.ipynb

Error Message

InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument:  assertion failed: [Condition x == y did not hold element-wise:] [x (sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/Shape_1:0) = ] [32 7] [y (sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/strided_slice:0) = ] [32 4]
     [[node sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/assert_equal_1/Assert/Assert (defined at var/folders/py/f2mvp4b141l29q795tkjtxr40000gn/T/ipykernel_22910/309539710.py:1) ]]
     [[Func/gradient_tape/model_11/basic_decoder_11/decoder/while/model_11/basic_decoder_11/decoder/while_grad/body/_225/input/_684/_200]]
  (1) Invalid argument:  assertion failed: [Condition x == y did not hold element-wise:] [x (sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/Shape_1:0) = ] [32 7] [y (sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/strided_slice:0) = ] [32 4]
     [[node sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/assert_equal_1/Assert/Assert (defined at var/folders/py/f2mvp4b141l29q795tkjtxr40000gn/T/ipykernel_22910/309539710.py:1) ]]
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_55181]

Function call stack:
train_function -> train_function
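
One way to read the assertion: the loss compares the label shape, [32 7], with the model output shape without its class axis, [32 4]. The NumPy sketch below only reproduces that comparison. It assumes, which the post does not confirm, that the decoder stopped after 4 steps because the longest sequence length in that batch was 4. The vocabulary size and the random lengths are made-up values.

```python
import numpy as np

batch_size, label_steps, vocab_size = 32, 7, 10  # vocab_size is hypothetical

# Labels keep the full padded width of y_train (7 steps).
labels = np.zeros((batch_size, label_steps), dtype=np.int32)

# Hypothetical per-batch sequence lengths whose maximum is 4.
seq_lengths = np.random.default_rng(0).integers(1, 5, size=batch_size)
seq_lengths[0] = 4  # guarantee that the maximum is exactly 4
decoded_steps = int(seq_lengths.max())

# Stand-in for a decoder output that ends at the longest sequence in the batch.
logits = np.zeros((batch_size, decoded_steps, vocab_size), dtype=np.float32)

# Same comparison as in the error: label shape vs. logits shape minus the class axis.
shapes_match = labels.shape == logits.shape[:-1]
print(labels.shape, logits.shape[:-1], shapes_match)
```

If that assumption holds, the mismatch would vary from batch to batch, since it depends on the largest value in seq_length for each batch.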

Model

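import numpy as np
import tensorflow as tf
import tensorflow_addons as tfa
from tensorflow import keras

encoder_inputs = keras.layers.Input(shape=[None], dtype=np.int32)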
decoder_inputs = keras.layers.Input(shape=[None], dtype=np.int32)
sequence_lengths = keras.layers.Input(shape=[], dtype=np.int32)

embeddings = keras.layers.Embedding(input_vocab_size, embedding_size)
encoder_embeddings = embeddings(encoder_inputs)
decoder_embeddings = embeddings(decoder_inputs)

encoder = keras.layers.LSTM(512, return_state=True)
encoder_outputs, state_h, state_c = encoder(encoder_embeddings)
encoder_state = [state_h, state_c]

sampler = tfa.seq2seq.sampler.TrainingSampler()

decoder_cell = keras.layers.LSTMCell(512)
output_layer = keras.layers.Dense(target_vocab_size)
decoder = tfa.seq2seq.basic_decoder.BasicDecoder(decoder_cell, sampler,
                                                 output_layer=output_layer)
final_outputs, final_state, final_sequence_lengths = decoder(
    decoder_embeddings, initial_state=encoder_state,
    sequence_length=sequence_lengths)
Y_proba = tf.nn.softmax(final_outputs.rnn_output)

model = keras.models.Model(
    inputs=[encoder_inputs, decoder_inputs, sequence_lengths],
    outputs=[Y_proba])

Variables and other information

X_train.shape = (24575, 35)
y_train.shape = (24575, 7)
X_decoder.shape = (24575, 7)
seq_length.shape = (24575,)

X_train and y_train are tokenised and padded arrays.

X_decoder is y_train shifted right by one step, i.e. X_decoder = np.c_[np.zeros((y_train.shape[0], 1)), y_train[:, :-1]]
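
To make the shift concrete, here is a minimal sketch on a toy array. The token ids and the names y_toy, X_decoder_toy and X_decoder_int are made up for illustration; only the np.c_ expression comes from the post.

```python
import numpy as np

# Toy stand-in for y_train: 2 sequences padded to length 4 (token ids are made up).
y_toy = np.array([[5, 6, 2, 0],
                  [7, 2, 0, 0]])

# Prepend a column of zeros and drop the last column, as in the post.
X_decoder_toy = np.c_[np.zeros((y_toy.shape[0], 1)), y_toy[:, :-1]]
print(X_decoder_toy)        # rows become [0, 5, 6, 2] and [0, 7, 2, 0]
print(X_decoder_toy.dtype)  # float64, because np.zeros defaults to float64

# Variant that keeps the integer dtype of the labels.
X_decoder_int = np.c_[np.zeros((y_toy.shape[0], 1), dtype=y_toy.dtype), y_toy[:, :-1]]
print(X_decoder_int.dtype)
```

Note that the expression from the post produces a float array, while the model's Input layers declare int32. That may or may not matter here, but passing a dtype to np.zeros avoids the implicit conversion.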

seq_length is an array holding, for each row of y_train, the index of the eos token.
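
A sketch of how such an array could be computed, assuming a hypothetical eos id of 2 and a toy label array. The post does not show the actual preprocessing, so EOS_ID, y_toy and eos_index are illustrative names only.

```python
import numpy as np

EOS_ID = 2  # hypothetical id of the eos token

# Toy stand-in for y_train (token ids are made up).
y_toy = np.array([[5, 6, 2, 0],
                  [7, 2, 0, 0]])

# argmax over a boolean array returns the position of the first True in each row,
# which here is the index of the first eos token.
eos_index = (y_toy == EOS_ID).argmax(axis=1)
print(eos_index)  # [2 1]
```

Two caveats with this sketch: argmax returns 0 for a row that contains no eos token at all, and the index of eos is one less than the number of tokens up to and including eos, so whether to use the index or the index plus one depends on whether the eos step should be counted.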



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow