ValueError: Layer weight shape (775, 768) not compatible with provided weight shape (775, 100, 768)

20220422: I've read some tutorials on other embeddings, and I now think my real question is: how do I create an embedding matrix from BERT word embeddings?

20220420: I want to use BERT only for word embeddings and then use an LSTM for text classification. I used bert-as-service to get the BERT word embeddings; the result has shape (775, 100, 768), that is, (batch_size, maxlen, embed_size). That shape is not compatible with the weight shape the embedding layer expects.

I used the BERT word embeddings as the weights of the embedding layer (based on https://www.kaggle.com/code/shujian/blend-of-lstm-and-cnn-with-4-embeddings-1200d/notebook).

But I'm new to Python and don't know how to convert the BERT word embeddings to shape (max_features, embed_size), so that I can concatenate them with other embeddings such as GloVe, as that tutorial does.
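To make the target shape concrete, here is a minimal sketch of one possible conversion: average every contextual vector a word receives across all sentences, so each word index ends up with a single row. This is only an assumption about what could work, not a confirmed method. `build_embedding_matrix` is a hypothetical helper, the random array stands in for the bert-as-service output, and the sketch assumes that position j of a sentence's output belongs to token j (which may not hold if the server inserts special tokens or truncates) and that the tokens match the keys of the Keras tokenizer's `word_index` (the Keras `Tokenizer` lowercases and strips punctuation by default, so plain `s.split()` tokens may differ).

```python
import numpy as np


def build_embedding_matrix(tokenized_texts, token_vecs, word_index, max_features):
    """Average each word's contextual vectors into one static row per word index.

    tokenized_texts: list of token lists, one per sentence
    token_vecs:      array of shape (num_sentences, maxlen, embed_size)
    word_index:      dict mapping token -> integer index (e.g. tokenizer.word_index)
    Returns an array of shape (max_features, embed_size).
    """
    embed_size = token_vecs.shape[-1]
    max_len = token_vecs.shape[1]
    sums = np.zeros((max_features, embed_size), dtype=np.float64)
    counts = np.zeros(max_features, dtype=np.int64)

    for sent_idx, tokens in enumerate(tokenized_texts):
        # Ignore tokens beyond the positions the encoder actually returned.
        for pos, token in enumerate(tokens[:max_len]):
            idx = word_index.get(token)
            if idx is None or idx >= max_features:
                continue  # unknown word, or outside the kept vocabulary
            sums[idx] += token_vecs[sent_idx, pos]
            counts[idx] += 1

    # Rows for words that never occurred stay all-zero.
    seen = counts > 0
    sums[seen] /= counts[seen][:, None]
    return sums


# Toy stand-in data (hypothetical): 2 sentences, maxlen 4, 8-dimensional vectors.
rng = np.random.default_rng(0)
toy_texts = [["good", "movie"], ["bad", "movie"]]
toy_vecs = rng.normal(size=(2, 4, 8))
toy_word_index = {"movie": 1, "good": 2, "bad": 3}  # index 0 is reserved for padding

toy_matrix = build_embedding_matrix(toy_texts, toy_vecs, toy_word_index, max_features=4)
print(toy_matrix.shape)
```

The result is 2-D, so it has the (max_features, embed_size) layout that the embedding layer and the GloVe matrices in the tutorial use. Averaging throws away the context sensitivity of BERT, so this is a trade-off rather than an equivalent representation.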

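import pandas as pd
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences

train_trail = pd.read_csv('C:/Users/Ariel/trial/train.csv')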
test_trail = pd.read_csv('C:/Users/Ariel/trial/test.csv')

embed_size = 768  # size of each word vector
max_features = 775  # number of unique words to use (i.e. number of rows in the embedding matrix)

####!!!change later!!!
maxlen = 100 # max number of words in a question to use 

## split to train and val
from sklearn.model_selection import train_test_split
train_trail, val_trial = train_test_split(train_trail, test_size=0.1, random_state=2018)

train_trail_X = train_trail["processed_text"]
val_trial_X = val_trial["processed_text"]
test_trail_X = test_trail["processed_text"]



## Tokenize the sentences
tokenizer = Tokenizer(num_words=max_features)
tokenizer.fit_on_texts(list(train_trail_X))
train_trail_X = tokenizer.texts_to_sequences(train_trail_X)
val_trial_X = tokenizer.texts_to_sequences(val_trial_X)
test_trail_X = tokenizer.texts_to_sequences(test_trail_X)

## Pad the sentences 
train_trail_X = pad_sequences(train_trail_X, maxlen=maxlen)
val_trial_X = pad_sequences(val_trial_X, maxlen=maxlen)
test_trail_X = pad_sequences(test_trail_X, maxlen=maxlen)

## Get the target values
train_trail_y = train_trail['label'].values
val_trial_y = val_trial['label'].values

from bert_serving.client import BertClient
bc = BertClient()
texts = train_trail['processed_text'].tolist()
texts2 = [s.split() for s in texts]
vecs = bc.encode(texts2, is_tokenized=True)

The model

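# Build the sentiment model
from keras.models import Sequential
from keras.layers import Embedding, Bidirectional, LSTM, Conv1D, GlobalMaxPool1D, Dense
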
def getModel():
    embedding_layer = Embedding(input_dim=max_features,
                                output_dim=embed_size,
                                weights=[vec_squeeze],  # BERT vectors; shape (775, 100, 768) per the traceback
                                trainable=False)

    model = Sequential([
        embedding_layer,
        Bidirectional(LSTM(100, dropout=0.3, return_sequences=True)),
        Bidirectional(LSTM(100, dropout=0.3, return_sequences=True)),
        Conv1D(100, 5, activation='relu'),
        GlobalMaxPool1D(),
        Dense(16, activation='relu'),
        Dense(1, activation='sigmoid'),
    ],
    name="Sentiment_Model")
    return model

training_model = getModel()
training_model.summary()

Then the error occurred:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-102-fdbf69968d36> in <module>
----> 1 training_model = getModel()
      2 training_model.summary()

<ipython-input-101-d82f78e4293f> in getModel()
     15         Dense(1, activation='sigmoid'),
     16     ],
---> 17     name="Sentiment_Model")
     18     return model
     19 

D:\Anaconda3\envs\bertuse\lib\site-packages\keras\engine\sequential.py in __init__(self, layers, name)
     90         if layers:
     91             for layer in layers:
---> 92                 self.add(layer)
     93 
     94     @property

D:\Anaconda3\envs\bertuse\lib\site-packages\keras\engine\sequential.py in add(self, layer)
    164                     # and create the node connecting the current layer
    165                     # to the input layer we just created.
--> 166                     layer(x)
    167                     set_inputs = True
    168                 else:

D:\Anaconda3\envs\bertuse\lib\site-packages\keras\engine\base_layer.py in __call__(self, inputs, **kwargs)
    437                 # Load weights that were specified at layer instantiation.
    438                 if self._initial_weights is not None:
--> 439                     self.set_weights(self._initial_weights)
    440 
    441             # Raise exceptions in case the input is not compatible

D:\Anaconda3\envs\bertuse\lib\site-packages\keras\engine\base_layer.py in set_weights(self, weights)
   1070                                  str(pv.shape) +
   1071                                  ' not compatible with '
-> 1072                                  'provided weight shape ' + str(w.shape))
   1073             weight_value_tuples.append((p, w))
   1074         K.batch_set_value(weight_value_tuples)

ValueError: Layer weight shape (775, 768) not compatible with provided weight shape (775, 100, 768)
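
My current reading of the traceback, which may well be wrong: the embedding layer wants a 2-D lookup table with one row per word index, shape (input_dim, output_dim), while the array I passed is 3-D with one vector per token position in each sentence. The leading 775 in my array is the number of sentences, which only happens to equal max_features. The sketch below reproduces that shape check with plain NumPy on toy sizes; `check_embedding_weights` is a hypothetical helper of my own, not a Keras function.

```python
import numpy as np


def check_embedding_weights(weights, input_dim, output_dim):
    """Raise early if `weights` is not a 2-D (input_dim, output_dim) lookup table."""
    expected = (input_dim, output_dim)
    if weights.shape != expected:
        raise ValueError(
            f"expected weight shape {expected}, got {weights.shape}"
        )
    return weights


# Toy sizes (hypothetical): 5 sentences, 3 token positions, 4-dimensional vectors.
per_token_vecs = np.zeros((5, 3, 4), dtype=np.float32)

try:
    check_embedding_weights(per_token_vecs, input_dim=5, output_dim=4)
    message = "ok"
except ValueError as err:
    message = str(err)

print(message)
```

Running a check like this before building the model would at least surface the mismatch without the long Keras traceback.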

I'm having a hard time with this task, so any help would be greatly appreciated. Many thanks in advance!



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
