TensorFlow: ConcatOp : Dimension 1 in both shapes must be equal: shape[0] = [16,101,13] vs. shape[1] = [16,52,13] [Op:ConcatV2] name: concat

I'm doing Named Entity Recognition (NER) using Hugging Face Transformers.

The model is implemented exactly as stated in the guide, but when I try to predict the tags on the test dataset I get an error:

InvalidArgumentError: ConcatOp : Dimension 1 in both shapes must be equal: shape[0] = [16,101,13] vs. shape[1] = [16,52,13] [Op:ConcatV2] name: concat

With the following stack trace:

---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
/Users/raffaysajjad/Library/CloudStorage/OneDrive-Personal/MS Computer Science/Semester 4 (Spring 2022)/CS 5316 - Natural Language Processing/Assignments/Assignment 4/Assignment4_Part4_20030001.ipynb Cell 32' in <module>
----> 1 predictions = model.predict(tf_test_set)

File /usr/local/lib/python3.9/site-packages/keras/utils/traceback_utils.py:67, in filter_traceback.<locals>.error_handler(*args, **kwargs)
     65 except Exception as e:  # pylint: disable=broad-except
     66   filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67   raise e.with_traceback(filtered_tb) from None
     68 finally:
     69   del filtered_tb

File /usr/local/lib/python3.9/site-packages/tensorflow/python/framework/ops.py:7186, in raise_from_not_ok_status(e, name)
   7184 def raise_from_not_ok_status(e, name):
   7185   e.message += (" name: " + name if name is not None else "")
-> 7186   raise core._status_to_exception(e) from None

This is how I implemented the model:

from transformers import TFAutoModelForTokenClassification

model = TFAutoModelForTokenClassification.from_pretrained("distilbert-base-uncased", num_labels=len(label_list))

from transformers import create_optimizer
batch_size = 16
num_train_epochs = 3
num_train_steps = (len(tokenized_wnut["train"]) // batch_size) * num_train_epochs
optimizer, lr_schedule = create_optimizer(
    init_lr=2e-5,
    num_train_steps=num_train_steps,
    weight_decay_rate=0.01,
    num_warmup_steps=0,
)
model.compile(optimizer=optimizer)
model.fit(
    tf_train_set,
    validation_data=tf_validation_set,
    epochs=num_train_epochs,
)
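For completeness, data_collator and the train/validation sets were set up as in the guide; a sketch of that setup (assuming the same collator and batch size are used throughout):

from transformers import AutoTokenizer, DataCollatorForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
# Dynamically pads each batch to the longest sequence in that batch.
data_collator = DataCollatorForTokenClassification(tokenizer, return_tensors="tf")

tf_train_set = tokenized_wnut["train"].to_tf_dataset(
    columns=["attention_mask", "input_ids", "labels"],
    shuffle=True,
    batch_size=batch_size,
    collate_fn=data_collator,
)
tf_validation_set = tokenized_wnut["validation"].to_tf_dataset(
    columns=["attention_mask", "input_ids", "labels"],
    shuffle=False,
    batch_size=batch_size,
    collate_fn=data_collator,
)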

This is the prediction attempt that results in the aforementioned error:

tf_test_set = tokenized_wnut["test"].to_tf_dataset(
    columns=["attention_mask", "input_ids", "labels"],
    shuffle=True,
    batch_size=16,
    collate_fn=data_collator,
)
predictions = model.predict(tf_test_set)
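One workaround I've tried is padding every batch to the same fixed length so that all prediction batches share dimension 1. A sketch of that (the max_length of 128 is an assumption and would need to be at least the longest tokenized test sequence):

from transformers import DataCollatorForTokenClassification

# Pad all batches to one fixed length instead of per-batch dynamic padding,
# so predict() can concatenate the per-batch outputs along dimension 1.
fixed_collator = DataCollatorForTokenClassification(
    tokenizer,
    padding="max_length",
    max_length=128,  # assumption: must cover the longest tokenized test sequence
    return_tensors="tf",
)
tf_test_set_fixed = tokenized_wnut["test"].to_tf_dataset(
    columns=["attention_mask", "input_ids", "labels"],
    shuffle=False,
    batch_size=16,
    collate_fn=fixed_collator,
)
predictions = model.predict(tf_test_set_fixed)

But I'd still like to understand why the per-batch dynamic padding breaks predict() in the first place.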

Can somebody help me figure out the issue here?



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
