How to build a character-level Siamese network using Keras

I am trying to build a character-level Siamese neural network using Keras, to learn whether two names are similar or not.

So my two inputs X1 and X2 are 3-D arrays of shape:
X[number_of_cases, max_length_of_name, total_number_of_chars_in_DB]

In the real case:

  • number_of_cases = 5000
  • max_length_of_name = 50
  • total_number_of_chars_in_DB = 38

I have one binary output vector y of size [number_of_cases].

So, for example, print(X1[:3, :2]) will give the following result:

[[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]

 [[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]

 [[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]]
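
For context, these matrices come from one-hot encoding each character of each name. The snippet below is only a simplified illustration of that encoding, not my actual preprocessing; the character set, char_to_index and encode_names are placeholder names I am making up here (my real vocabulary also has 38 characters, but not necessarily these ones).

import numpy as np

# Made-up character set with 38 symbols (26 letters, 10 digits, space, hyphen)
chars = "abcdefghijklmnopqrstuvwxyz0123456789 -"
char_to_index = {c: i for i, c in enumerate(chars)}

max_length_of_name = 50

def encode_names(names):
    # One-hot encode a list of names into an array of shape
    # (number_of_cases, max_length_of_name, total_number_of_chars_in_DB)
    X = np.zeros((len(names), max_length_of_name, len(chars)))
    for i, name in enumerate(names):
        for j, c in enumerate(name[:max_length_of_name]):
            X[i, j, char_to_index[c]] = 1.0
    return X

X_toy = encode_names(["john smith", "jon smyth"])
print(X_toy.shape)   # (2, 50, 38)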

I use the following code to build my model:

from keras.models import Model
from keras.layers import Input, Dense, LSTM, Bidirectional, Lambda
from keras import backend as k

# Two one-hot name inputs of shape (max_length_of_name, total_number_of_chars_in_DB)
input_1 = Input(shape=(X1.shape[1], X1.shape[2]))
input_2 = Input(shape=(X2.shape[1], X2.shape[2]))

# One Bidirectional LSTM encoder per branch
lstm1 = Bidirectional(LSTM(256, input_shape=(X1.shape[1], X1.shape[2]), return_sequences=False))
lstm2 = Bidirectional(LSTM(256, input_shape=(X1.shape[1], X1.shape[2]), return_sequences=False))

# Element-wise similarity between the two encodings
l1_norm = lambda x: 1 - k.abs(x[0] - x[1])

merged = Lambda(function=l1_norm, output_shape=lambda x: x[0], name='L1_distance')([lstm1, lstm2])

# Binary classification: are the two names similar or not?
predictions = Dense(1, activation='sigmoid', name='classification_layer')(merged)

model = Model([input_1, input_2], predictions)
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

model.fit([X1, X2], y, validation_split=0.1, epochs=20, shuffle=True, batch_size=256)

I am getting the following error:

Layer L1_distance was called with an input that isn't a symbolic tensor.

I think the problem is that I need to tell the L1_distance layer to use the outputs of the two preceding LSTM layers, but I do not know how to do that.
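
This is roughly what I imagine the wiring should look like, with one shared encoder applied to both input tensors so that the Lambda layer receives symbolic tensors rather than layer objects, but I am not sure it is correct (untested sketch; shared_lstm, encoded_1 and encoded_2 are names I made up):

# Apply one shared Bidirectional LSTM to both inputs, then pass
# the two resulting tensors to the Lambda layer
shared_lstm = Bidirectional(LSTM(256, return_sequences=False))

encoded_1 = shared_lstm(input_1)   # symbolic tensor of shape (None, 512)
encoded_2 = shared_lstm(input_2)   # weights shared between the two branches

merged = Lambda(function=l1_norm, output_shape=lambda x: x[0],
                name='L1_distance')([encoded_1, encoded_2])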

The second question is: am I obliged to add an embedding layer before the LSTM, even in the case of a character-level network?
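
What I have in mind as the alternative is something like the following, where the inputs would be integer character indices of shape (max_length_of_name,) instead of one-hot vectors (again only a sketch; the variable names and the embedding size of 32 are arbitrary choices of mine):

from keras.layers import Input, Embedding, LSTM, Bidirectional

int_input_1 = Input(shape=(50,), dtype='int32')      # integer character indices, max_length_of_name = 50
embedding = Embedding(input_dim=38, output_dim=32)   # 38 characters in the vocabulary
embedded_1 = embedding(int_input_1)                  # shape (None, 50, 32)
encoded_1 = Bidirectional(LSTM(256))(embedded_1)     # shape (None, 512)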

Thank you.



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
