Why does the BERT model fail to find an option that matches my input positional arguments?

While working on an NLP exercise, I tried to use the BERT architecture as the base of my model. I defined a function that builds and compiles a Keras model with BERT as a layer. However, when I call the function to build the model, I get an error saying that the BERT layer could not find an option matching my input positional arguments.

The dimensions of my positional arguments are [None, 160] but the BERT Layer seemingly expects them to be [None, None]. How do I resolve this?

To reproduce my problem:

These are the libraries I imported:

import tensorflow as tf
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.models import Model
import tensorflow_hub as hub

Next, I defined a function for the model as follows:

# Build and compile the model

def build_model(bert_layer, max_len = 512):
    input_word_ids = Input(shape=(max_len,), dtype=tf.int32, name="input_word_ids")
    input_mask = Input(shape=(max_len,), dtype=tf.int32, name="input_mask")
    segment_ids = Input(shape=(max_len,), dtype=tf.int32, name="segment_ids")

    pooled_output, sequence_output = bert_layer([input_word_ids, input_mask, segment_ids])
    clf_output = sequence_output[:, 0, :]
    out = Dense(1, activation='sigmoid')(clf_output)
    
    model = Model(inputs=[input_word_ids, input_mask, segment_ids], outputs=out)
    model.compile(Adam(lr=1e-5), loss='binary_crossentropy', metrics=['accuracy'])
    
    return model

Next, I downloaded the BERT architecture and instantiated the bert_layer as follows:

module_url = "https://tfhub.dev/tensorflow/bert_en_uncased_L-24_H-1024_A-16/4"
bert_layer = hub.KerasLayer(module_url, trainable=True)

Finally, I tried to build the model using the build_model function and bert_layer as seen below:

model = build_model(bert_layer, max_len=160)
model.summary()

But this returns an error which I think implies that the dimensions of my input are different from the dimensions that are required. The error is as seen below:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-42-516b88804394> in <module>
----> 1 model = build_model(bert_layer, max_len=160)
      2 model.summary()

<ipython-input-41-713013238e2f> in build_model(bert_layer, max_len)
      6     segment_ids = Input(shape=(max_len,), dtype=tf.int32, name="segment_ids")
      7 
----> 8     pooled_output, sequence_output = bert_layer([input_word_ids, input_mask, segment_ids])
      9     clf_output = sequence_output[:, 0, :]
     10     out = Dense(1, activation='sigmoid')(clf_output)

~\Anaconda3\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py in __call__(self, inputs, *args, **kwargs)
    840                     not base_layer_utils.is_in_eager_or_tf_function()):
    841                   with auto_control_deps.AutomaticControlDependencies() as acd:
--> 842                     outputs = call_fn(cast_inputs, *args, **kwargs)
    843                     # Wrap Tensors in `outputs` in `tf.identity` to avoid
    844                     # circular dependencies.

~\Anaconda3\lib\site-packages\tensorflow_core\python\autograph\impl\api.py in wrapper(*args, **kwargs)
    235       except Exception as e:  # pylint:disable=broad-except
    236         if hasattr(e, 'ag_error_metadata'):
--> 237           raise e.ag_error_metadata.to_exception(e)
    238         else:
    239           raise

ValueError: in converted code:
    relative to C:\Users\Wolemercy\Anaconda3\lib\site-packages:

    tensorflow_hub\keras_layer.py:237 call  *
        result = smart_cond.smart_cond(training,
    tensorflow_core\python\framework\smart_cond.py:59 smart_cond
        name=name)
    tensorflow_core\python\saved_model\load.py:436 _call_attribute
        return instance.__call__(*args, **kwargs)
    tensorflow_core\python\eager\def_function.py:457 __call__
        result = self._call(*args, **kwds)
    tensorflow_core\python\eager\def_function.py:494 _call
        results = self._stateful_fn(*args, **kwds)
    tensorflow_core\python\eager\function.py:1822 __call__
        graph_function, args, kwargs = self._maybe_define_function(args, kwargs)
    tensorflow_core\python\eager\function.py:2150 _maybe_define_function
        graph_function = self._create_graph_function(args, kwargs)
    tensorflow_core\python\eager\function.py:2041 _create_graph_function
        capture_by_value=self._capture_by_value),
    tensorflow_core\python\framework\func_graph.py:915 func_graph_from_py_func
        func_outputs = python_func(*func_args, **func_kwargs)
    tensorflow_core\python\eager\def_function.py:358 wrapped_fn
        return weak_wrapped_fn().__wrapped__(*args, **kwds)
    tensorflow_core\python\saved_model\function_deserialization.py:262 restored_function_body
        "\n\n".join(signature_descriptions)))

    ValueError: Could not find matching function to call loaded from the SavedModel. Got:
      Positional arguments (3 total):
        * [<tf.Tensor 'inputs:0' shape=(None, 160) dtype=int32>, <tf.Tensor 'inputs_1:0' shape=(None, 160) dtype=int32>, <tf.Tensor 'inputs_2:0' shape=(None, 160) dtype=int32>]
        * True
        * None
      Keyword arguments: {}
    
    Expected these arguments to match one of the following 4 option(s):
    
    Option 1:
      Positional arguments (3 total):
        * {'input_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_type_ids'), 'input_word_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_word_ids'), 'input_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_mask')}
        * False
        * None
      Keyword arguments: {}
    
    Option 2:
      Positional arguments (3 total):
        * {'input_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_type_ids'), 'input_word_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_word_ids'), 'input_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_mask')}
        * False
        * None
      Keyword arguments: {}
    
    Option 3:
      Positional arguments (3 total):
        * {'input_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_type_ids'), 'input_word_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_word_ids'), 'input_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_mask')}
        * True
        * None
      Keyword arguments: {}
    
    Option 4:
      Positional arguments (3 total):
        * {'input_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_type_ids'), 'input_word_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_word_ids'), 'input_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_mask')}
        * True
        * None
      Keyword arguments: {}

My expectation was that the model would be compiled successfully. Instead, I got this error.



Solution 1:[1]

First of all, you need the BERT preprocessor:

bert_preprocessor = hub.load("https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3")

This gives you input_word_ids, input_mask and input_type_ids (the segment IDs), packed in the dict that the encoder expects. You simply pass your raw text to bert_preprocessor.
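
To make the three inputs concrete, here is a minimal plain-Python sketch of what a packed example looks like. `pack_inputs` is a hypothetical helper written for illustration, not part of TensorFlow Hub, and the special token IDs (101 for [CLS], 102 for [SEP], 0 for padding) are assumed from the uncased BERT vocabulary. The dict keys are the ones listed in the traceback in the question.

```python
# Hypothetical helper: mimics the layout of the preprocessor's output for a
# single sentence whose tokens were already mapped to vocabulary IDs.
CLS_ID, SEP_ID, PAD_ID = 101, 102, 0  # assumed IDs, uncased BERT vocabulary


def pack_inputs(token_ids, seq_length):
    """Truncate or pad one tokenized sentence to exactly seq_length positions."""
    # Two positions are reserved for the [CLS] and [SEP] markers.
    body = token_ids[: seq_length - 2]
    word_ids = [CLS_ID] + body + [SEP_ID]
    n_real = len(word_ids)
    n_pad = seq_length - n_real
    return {
        "input_word_ids": word_ids + [PAD_ID] * n_pad,
        "input_mask": [1] * n_real + [0] * n_pad,  # 1 = real token, 0 = padding
        "input_type_ids": [0] * seq_length,        # single segment, so all zeros
    }


# Two arbitrary example token IDs, packed to a length of 6.
packed = pack_inputs([7592, 2088], seq_length=6)
print(packed["input_word_ids"])  # [101, 7592, 2088, 102, 0, 0]
print(packed["input_mask"])      # [1, 1, 1, 1, 0, 0]
```

The real preprocessor does this for whole batches of strings and returns tensors, but the keys and the meaning of each entry are the same.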

Then load your BERT model as a KerasLayer:

bert_model = hub.KerasLayer("https://tfhub.dev/tensorflow/bert_en_uncased_L-24_H-1024_A-16/4", trainable=True)
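
If your inputs are already tokenized, as in the question, a smaller change may be enough. Every option in the traceback expects one dict keyed by input_word_ids, input_mask and input_type_ids, while the question passes a list. The sketch below adapts the question's build_model accordingly. It assumes the layer returns a dict with a 'sequence_output' entry, as the code in this answer does, and it has not been run against the actual Hub module here.

```python
import tensorflow as tf
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam


def build_model(bert_layer, max_len=512):
    # One dict of inputs instead of a list; keys follow the traceback.
    encoder_inputs = {
        "input_word_ids": Input(shape=(max_len,), dtype=tf.int32, name="input_word_ids"),
        "input_mask": Input(shape=(max_len,), dtype=tf.int32, name="input_mask"),
        "input_type_ids": Input(shape=(max_len,), dtype=tf.int32, name="input_type_ids"),
    }

    # Assumption: the layer returns a dict, so outputs are read by key
    # rather than unpacked as a tuple.
    outputs = bert_layer(encoder_inputs)
    clf_output = outputs["sequence_output"][:, 0, :]  # vector at the [CLS] position
    out = Dense(1, activation="sigmoid")(clf_output)

    model = Model(inputs=encoder_inputs, outputs=out)
    model.compile(Adam(learning_rate=1e-5), loss="binary_crossentropy", metrics=["accuracy"])
    return model
```

It would be called the same way as in the question, for example build_model(bert_model, max_len=160), with training data supplied as a dict using the same three keys.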

To fine-tune your model, wire the preprocessor and the encoder together with the functional API:

def bert_functional_API(seq_length):
    # Raw text goes in: one scalar string per example.
    text_input = [tf.keras.layers.Input(shape=(), dtype=tf.string)]

    # Tokenize each text segment, then pad/truncate to seq_length.
    tokenize = hub.KerasLayer(bert_preprocessor.tokenize)
    tokenized_inputs = [tokenize(segment) for segment in text_input]
    bert_pack_inputs = hub.KerasLayer(bert_preprocessor.bert_pack_inputs,
                                      arguments=dict(seq_length=seq_length))
    encoder_inputs = bert_pack_inputs(tokenized_inputs)

    # The encoder takes the packed dict and returns a dict of outputs.
    bert_outputs = bert_model(encoder_inputs)
    pooled_output = bert_outputs['pooled_output']      # one vector per example
    sequence_output = bert_outputs['sequence_output']  # one vector per token

    # Classify from the pooled vector so the model emits one score per example.
    output = Dense(1, activation='sigmoid')(pooled_output)

    model = Model(inputs=text_input, outputs=output)
    model.compile(Adam(learning_rate=1e-5), loss='binary_crossentropy', metrics=['accuracy'])

    return model
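
On the shapes mentioned in the question: a None in an expected shape means "any size", so (None, 160) does fit (None, None). What fails to match is the container, since a list was passed where a dict was expected. The plain-Python sketch below mimics that kind of check. `shape_compatible` and `matches` are hypothetical helpers for illustration and a simplification of what TensorFlow does internally, not its actual API.

```python
def shape_compatible(actual, expected):
    """True if `actual` fits `expected`; None in `expected` accepts any size."""
    if len(actual) != len(expected):
        return False
    return all(e is None or a == e for a, e in zip(actual, expected))


def matches(args, spec):
    """Check the structure first (a dict with the same keys), then each shape."""
    if not isinstance(args, dict) or set(args) != set(spec):
        return False
    return all(shape_compatible(args[key], spec[key]) for key in spec)


names = ("input_word_ids", "input_mask", "input_type_ids")
spec = {name: (None, None) for name in names}

as_list = [(None, 160), (None, 160), (None, 160)]  # what the question passes
as_dict = dict(zip(names, as_list))                # same shapes, keyed by name

print(shape_compatible((None, 160), (None, None)))  # True: the shape is fine
print(matches(as_list, spec))                       # False: wrong container
print(matches(as_dict, spec))                       # True
```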



 

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1