Whenever I try to print the classification report, the accuracy column shows a different value, 0.50, but my accuracy is 0.96.

  1. Importing libraries
import matplotlib.pyplot as plt
import seaborn as sns
import keras
from keras.layers import *
from keras.models import *
from sklearn.metrics import classification_report, confusion_matrix
from keras.preprocessing import image
from keras.callbacks import EarlyStopping
from keras.callbacks import ModelCheckpoint
from keras.callbacks import LearningRateScheduler
from keras.applications.densenet import DenseNet121
import numpy as np # linear algebra
import pandas as pd
  2. Directories + model training + testing
import os
for dirname, _, filenames in os.walk(r'Research Work\Data'):
    for filename in filenames:
        os.path.join(dirname, filename)

my_data_dir = r'Research Work\Data'
test_path = my_data_dir+'/test/'
train_path = my_data_dir+'/train/'

image_size = (224, 224, 3)
batch_size = 32

train_datagen = image.ImageDataGenerator(rescale=1./255)
test_datagen = image.ImageDataGenerator(rescale=1./255)
early_stop = EarlyStopping(monitor='val_loss', patience=100, verbose=1)

train_generator = train_datagen.flow_from_directory(
    train_path,
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical')
from keras import backend as K

# Batch-wise metric helpers. Keras evaluates these one batch at a time and
# averages the results over the epoch, so they approximate the true
# epoch-level precision, recall, and F1.
def recall_m(y_true, y_pred):
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
    recall = true_positives / (possible_positives + K.epsilon())
    return recall

def precision_m(y_true, y_pred):
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
    precision = true_positives / (predicted_positives + K.epsilon())
    return precision

def f1_m(y_true, y_pred):
    precision = precision_m(y_true, y_pred)
    recall = recall_m(y_true, y_pred)
    return 2 * ((precision * recall) / (precision + recall + K.epsilon()))
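
A quick sanity check, not part of the original post: a minimal sketch exercising the helpers on a toy one-hot batch, assuming the K backend import above. Because Keras computes these per batch and averages over the epoch, they only approximate epoch-level scores.

y_true = K.constant([[1., 0., 0.], [0., 1., 0.]])          # one-hot targets for 2 samples
y_pred = K.constant([[0.9, 0.05, 0.05], [0.2, 0.7, 0.1]])  # softmax-like outputs
print(K.eval(precision_m(y_true, y_pred)))  # ~1.0: both rounded predictions are correct
print(K.eval(recall_m(y_true, y_pred)))     # ~1.0
print(K.eval(f1_m(y_true, y_pred)))         # ~1.0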

print(train_generator.class_indices)  # mapping from class name to label index

test_image_gen = test_datagen.flow_from_directory(
    test_path,
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical')
class_name = test_image_gen.class_indices
epochs = 100
stepsperepoch=9
validationsteps=1

annealer = LearningRateScheduler(lambda x: 1e-3 * 0.95 ** x)

es = EarlyStopping(monitor='val_accuracy', mode='max', verbose=1, patience=100)
mc = ModelCheckpoint("own.h5", monitor='val_loss', save_best_only=True, mode='min', verbose=1)

input_t = Input(shape=(224, 224, 3))

model = DenseNet121(
    include_top=True,
    weights=None,
    input_tensor=input_t,
    input_shape=None,
    pooling=None,
    classes=3,
)

model.compile(loss='categorical_crossentropy', optimizer="RMSprop", metrics=['accuracy', f1_m, precision_m, recall_m])

model.summary()

# Note: fit_generator is deprecated in newer Keras; model.fit accepts generators directly.
hist = model.fit_generator(
    train_generator,
    epochs=epochs,
    callbacks=[annealer, mc, es],
    steps_per_epoch=stepsperepoch,
    validation_data=test_image_gen,
    validation_steps=validationsteps,
)
metrics = model.evaluate(test_image_gen)
print("Validation Loss = " + str(metrics[0]))
print("Validation Accuracy = " + str(metrics[1]))
print("Validation F1 Score = " + str(metrics[2]))
print("Validation Precision = " + str(metrics[3]))
print("Validation Recall = " + str(metrics[4]))


predictions = np.argmax(model.predict(test_image_gen), axis=-1)
print(predictions)
print(f'Testing loss: {metrics[0]}')
print(f'Testing accuracy: {metrics[1]}')
  3. Printing the classification report. The accuracy is 96%, but the classification report shows 50%; which value does it print?
print(classification_report(test_image_gen.classes, predictions))
print(confusion_matrix(test_image_gen.classes, predictions))
sns.heatmap(confusion_matrix(test_image_gen.classes, predictions), annot=True)

Output

              precision    recall  f1-score   support

           0       0.06      0.05      0.06       111
           1       0.24      0.25      0.24       301
           2       0.66      0.66      0.66       838

    accuracy                           0.50      1250
   macro avg       0.32      0.32      0.32      1250
weighted avg       0.51      0.50      0.51      1250


Solution 1:[1]

I think your predictions might be wrong. Correct me if I'm totally wrong, but you use numpy.argmax, which returns the indices of the maximum values, whereas you only want the prediction values. This could lead to the big difference in the classification report. You could just use:

predictions = model.predict(test_image_gen)
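
For context, a minimal sketch (not from the original answer) of the shapes involved, assuming the 3-class softmax model above; classification_report expects a 1-D array of class labels, so which of these you pass matters:

probs = model.predict(test_image_gen)  # shape (num_samples, 3): per-class softmax probabilities
labels = np.argmax(probs, axis=-1)     # shape (num_samples,): predicted class indices
print(probs.shape, labels.shape)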

Solution 2:[2]

It's because shuffle=True is set by default in flow_from_directory, while test_image_gen.classes returns the labels in unshuffled (directory) order; that mismatch is what causes the inconsistency, I think.

You can set shuffle=False on your validation generator:

validation_image_gen = image_gen.flow_from_directory(train_path,
                                                     target_size=target_size,
                                                     color_mode=color_mode,
                                                     batch_size=batch_size,
                                                     class_mode='binary',
                                                     shuffle=False,
                                                     subset='validation')
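
Applied to the code in the question, a minimal sketch (assuming the same test_path, test_datagen, and trained model as above): rebuild the test generator with shuffle=False so that test_image_gen.classes and the predictions refer to the same ordering:

test_image_gen = test_datagen.flow_from_directory(
    test_path,
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical',
    shuffle=False)  # keep directory order so labels line up with predictions

predictions = np.argmax(model.predict(test_image_gen), axis=-1)
print(classification_report(test_image_gen.classes, predictions))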

Regards

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution 1: desertnaut
Solution 2: Ronin