LSTM performed poorly with spaCy-embedded tensors (closed)

I found the problem!
The culprit was the loss function; it should have been binary cross-entropy:

LSTM.compile(optimizer=tf.keras.optimizers.Adam(),loss="binary_crossentropy",metrics=['accuracy'])

With that change everything works fine (the val_accuracy is approximately 0.8).
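For anyone wondering why the wrong loss produced exactly 0.0000e+00: with `from_logits=True`, `CategoricalCrossentropy` applies a softmax across the output units, and a softmax over a single unit is always 1, so the loss is -log(1) = 0 for every example and no gradient ever flows. A minimal sketch of that arithmetic (plain Python, no TensorFlow needed):

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(y_true, logits):
    # What CategoricalCrossentropy(from_logits=True) computes per sample.
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(y_true, probs))

# With one output unit, softmax over a single logit is always 1.0,
# so the loss is -log(1) = 0 no matter what the label or the logit is.
print(categorical_crossentropy([1.0], [-3.7]) == 0.0)  # True
print(categorical_crossentropy([1.0], [2.5]) == 0.0)   # True

# With two or more units the loss behaves normally:
print(categorical_crossentropy([1.0, 0.0], [0.0, 0.0]))  # log(2) ~ 0.693
```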

I would like to classify Twitter posts (labels 0 or 1). My first step is to convert each word in a sentence into its spaCy vector, like this:

x_train[:5] # I removed stop words.
>>0           deeds reason #earthquake may allah forgive
1               forest fire near la ronge sask. canada
2    residents asked 'shelter place' notified offic...
3    13,000 people receive #wildfires evacuation or...
4    got sent photo ruby #alaska smoke #wildfires p...
dtype: object

def spacy_emb(data):
    """Map each sentence to a list of per-token spaCy word vectors."""
    embedded = []
    for sentence in data:
        doc = nlp(sentence)
        embedded.append([list(token.vector) for token in doc])
    return embedded

emb_x_train=spacy_emb(x_train) # the trained set
emb_x_val=spacy_emb(x_val) # the validation set
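As a side note, if `tf.ragged.constant` or ragged Keras inputs give you trouble on your TF version, the common alternative is to zero-pad every sentence to the same length (what `tf.keras.preprocessing.sequence.pad_sequences` does for token IDs). A minimal pure-Python sketch of the idea, with the 300-dim spaCy vectors stubbed down to 4 dims:

```python
def pad_embedded(sentences, dim, pad_value=0.0):
    """Zero-pad variable-length lists of word vectors to a common length."""
    max_len = max(len(s) for s in sentences)
    zero_vec = [pad_value] * dim
    return [s + [zero_vec] * (max_len - len(s)) for s in sentences]

# Stub embeddings: two "sentences" with 2 and 3 "words", dim=4
emb = [[[0.1] * 4, [0.2] * 4],
       [[0.3] * 4, [0.4] * 4, [0.5] * 4]]
padded = pad_embedded(emb, dim=4)
print([len(s) for s in padded])  # [3, 3] -- a rectangular batch
```

The padded batch can then be passed to `tf.constant` directly, with a `Masking` layer (or `mask_zero`-style handling) so the LSTM ignores the padding steps.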

Thus each word is mapped to a specific 300-dimensional vector, e.g.

x_train[0] 
>>'deeds reason #earthquake may allah forgive'

pd.DataFrame(emb_x_train[0]).T
>>  0   1   2   3   4   5   6
0   -0.464950   -0.17729    -0.444670   -0.502520   -0.042501   -0.932850   -0.811760
1   -0.449840   0.17692 0.695360    0.092205    0.090773    0.052161    0.026418
2   -0.193010   -0.45867    0.427480    0.256320    -0.119180   0.134250    -0.506870
3   -0.421040   -0.10056    0.219060    0.088787    0.123720    0.026673    -0.115170
4   -0.416040   -0.31111    0.117570    -0.302910   -0.193020   0.276940    0.032999
... ... ... ... ... ... ... ...
295 0.457600    0.14850 -0.413770   -0.561560   0.108880    -0.707360   -0.209070
296 0.564490    -0.23429    -0.278680   -0.525640   -0.326820   0.450920    0.090611
297 0.095385    -0.17910    -0.079391   0.290760    -0.560680   0.885570    0.518240
298 0.304370    0.20167 -0.527170   -0.353490   0.191770    -0.258720   -0.100550
299 -0.439690   0.12226 -0.124130   -0.090766   -0.029525   -0.257910   -0.360320
300 rows × 7 columns

and then I feed the embeddings into an LSTM:

# for training,
tf_x_train=tf.ragged.constant(emb_x_train)
tf_y_train=tf.constant(y_train,shape=[len(y_train),1])

# for validation,
tf_x_val=tf.ragged.constant(emb_x_val)
tf_y_val=tf.constant(y_val,shape=[len(y_val),1])

LSTM=tf.keras.Sequential([
    tf.keras.layers.Input(shape=[66,300],dtype=tf.float64,ragged=True),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(128,return_sequences=True)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64,return_sequences=False)),
    tf.keras.layers.Dense(1,activation="sigmoid")
])
LSTM.compile(optimizer=tf.keras.optimizers.Adam(),
             loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
             metrics=['accuracy'])

But what I don't understand is why the performance was so bad:

history=LSTM.fit(tf_x_train,tf_y_train,epochs=10,validation_data=(tf_x_val,tf_y_val))
>>Epoch 1/10
227/227 [==============================] - 10s 43ms/step - loss: 0.0000e+00 - accuracy: 0.5704 - val_loss: 0.0000e+00 - val_accuracy: 0.5684
Epoch 2/10
227/227 [==============================] - 10s 42ms/step - loss: 0.0000e+00 - accuracy: 0.5704 - val_loss: 0.0000e+00 - val_accuracy: 0.5684
Epoch 3/10
227/227 [==============================] - 10s 42ms/step - loss: 0.0000e+00 - accuracy: 0.5704 - val_loss: 0.0000e+00 - val_accuracy: 0.5684
Epoch 4/10
227/227 [==============================] - 10s 42ms/step - loss: 0.0000e+00 - accuracy: 0.5704 - val_loss: 0.0000e+00 - val_accuracy: 0.5684
Epoch 5/10
227/227 [==============================] - 10s 43ms/step - loss: 0.0000e+00 - accuracy: 0.5704 - val_loss: 0.0000e+00 - val_accuracy: 0.5684
Epoch 6/10
227/227 [==============================] - 10s 43ms/step - loss: 0.0000e+00 - accuracy: 0.5704 - val_loss: 0.0000e+00 - val_accuracy: 0.5684
Epoch 7/10
227/227 [==============================] - 10s 43ms/step - loss: 0.0000e+00 - accuracy: 0.5704 - val_loss: 0.0000e+00 - val_accuracy: 0.5684
Epoch 8/10
227/227 [==============================] - 10s 42ms/step - loss: 0.0000e+00 - accuracy: 0.5704 - val_loss: 0.0000e+00 - val_accuracy: 0.5684
Epoch 9/10
227/227 [==============================] - 10s 43ms/step - loss: 0.0000e+00 - accuracy: 0.5704 - val_loss: 0.0000e+00 - val_accuracy: 0.5684
Epoch 10/10
227/227 [==============================] - 10s 42ms/step - loss: 0.0000e+00 - accuracy: 0.5704 - val_loss: 0.0000e+00 - val_accuracy: 0.5684
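The other giveaway in the log above: accuracy is frozen at exactly 0.5704 from epoch 1. With zero loss there is no gradient, so the model never updates and outputs the same prediction for every tweet, and accuracy collapses to the frequency of one class (presumably about 57% of this training set). A toy illustration with a hypothetical 57/43 label split:

```python
# Hypothetical labels with a 57/43 class split, mirroring the stuck accuracy.
labels = [0] * 57 + [1] * 43

# A model receiving zero gradient never updates: it predicts one constant class.
constant_pred = 0
accuracy = sum(constant_pred == y for y in labels) / len(labels)
print(accuracy)  # 0.57 -- the majority-class frequency
```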

Did I do anything wrong?



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
