Classification report not working well while doing text classification with LSTM model

I am trying to get a classification report for an LSTM model trained on data with text and labels. The report states that there are no 1s, which is not true, because the labels consist of 0 and 1.

Here is the report result: [image: classification report]

And here is the code I am using for this purpose:

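# Imports assumed from the calls used below (the original post did not show them)
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from keras.models import Sequential
from keras.layers import Embedding, SpatialDropout1D, LSTM, Dense
from keras.callbacks import EarlyStopping

# X, Y, MAX_NB_WORDS, EMBEDDING_DIM, epochs and batch_size are defined earlier (not shown)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.20)

# LSTM model
model = Sequential()
model.add(Embedding(MAX_NB_WORDS, EMBEDDING_DIM, input_length=X.shape[1]))
model.add(SpatialDropout1D(0.2))
model.add(LSTM(100, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(4, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()  # summary() prints the table itself
history = model.fit(X_train, Y_train, epochs=epochs, batch_size=batch_size,
                    validation_split=0.1,
                    callbacks=[EarlyStopping(monitor='val_loss', patience=3, min_delta=0.0001)])

# Convert one-hot rows and predicted probabilities to class indices before reporting
Y_pred = model.predict(X_test)
print(classification_report(Y_test.argmax(axis=1), Y_pred.argmax(axis=1)))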


Solution 1:[1]

This is an error of interpretation of the report:

  • The report does say that there are instances with 1 as the true label: the "support" column shows 164883 instances for that class (more than half).
  • However, the performance is zero everywhere for class 1. This means that the classifier never predicts that class correctly, and most likely never predicts it at all. This is obviously a sign that something went completely wrong.
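
To make the two points above concrete, here is a minimal sketch in plain Python that computes the same per-class quantities a classification report shows (precision, recall, support). The helper name `per_class_report` and the toy labels are made up for illustration; they are not from the question, and this is not scikit-learn's implementation. The sketch assumes a degenerate classifier that always predicts class 0.

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Return {label: (precision, recall, support)} for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)    # number of true instances per class
    predicted = Counter(y_pred)  # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    report = {}
    for label in labels:
        # Define precision/recall as 0.0 when the denominator is zero.
        precision = correct[label] / predicted[label] if predicted[label] else 0.0
        recall = correct[label] / support[label] if support[label] else 0.0
        report[label] = (precision, recall, support[label])
    return report


# Toy labels (hypothetical): class 1 is the majority, as in the question.
y_true = [0, 0, 1, 1, 1, 1]
# A degenerate classifier that always predicts class 0.
y_pred = [0] * len(y_true)

report = per_class_report(y_true, y_pred)
for label, (precision, recall, support) in report.items():
    print(f"class {label}: precision={precision:.2f} recall={recall:.2f} support={support}")

# Counting the predicted classes is a quick way to confirm the diagnosis.
print("predicted class counts:", dict(Counter(y_pred)))
```

In this toy case class 1 has a support of 4 but precision and recall of 0, and the prediction counts contain only class 0. Applying the same count to `Y_pred.argmax(axis=1)` from the question would show whether the model ever outputs class 1.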

Note: it is rare for a text classification task to be balanced. If any resampling was involved, it would be a mistake to evaluate on the resampled data.
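
The question does not say whether resampling was used, so the following is only a sketch of the ordering the note implies: split first, then resample the training part only, so that the test set keeps the original class distribution. The function `split_then_oversample` and the toy data are hypothetical, and random oversampling is just one possible resampling technique.

```python
import random
from collections import Counter


def split_then_oversample(samples, labels, test_fraction=0.2, seed=0):
    """Hold out a test set first, then oversample minority classes in train only."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    n_test = int(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    # Group the training indices by class.
    by_class = {}
    for i in train_idx:
        by_class.setdefault(labels[i], []).append(i)

    # Duplicate random training examples until every class matches the largest one.
    target = max(len(idx) for idx in by_class.values())
    balanced = []
    for idx in by_class.values():
        balanced.extend(idx)
        balanced.extend(rng.choices(idx, k=target - len(idx)))
    rng.shuffle(balanced)

    train = [(samples[i], labels[i]) for i in balanced]
    test = [(samples[i], labels[i]) for i in test_idx]  # left untouched
    return train, test


# Toy imbalanced data (hypothetical): 80 examples of class 1, 20 of class 0.
samples = [f"text {i}" for i in range(100)]
labels = [1] * 80 + [0] * 20

train, test = split_then_oversample(samples, labels)
train_counts = Counter(label for _, label in train)
test_counts = Counter(label for _, label in test)
print("train class counts:", dict(train_counts))
print("test class counts:", dict(test_counts))
```

The training counts come out equal across classes, while the test set still reflects the original imbalance and shares no examples with the training set, so a report computed on it describes performance on the real distribution.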

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Erwan