Why does shuffle=False on the validation set give better results in the confusion matrix and classification report than shuffle=True?
If I set shuffle=False while creating the test or validation dataset,

```python
test_dataset = test_image_gen.flow_from_directory(test_path,
                                                  target_size=(125, 125),
                                                  batch_size=batch_size,
                                                  class_mode='binary',
                                                  shuffle=False)
```
then, when making predictions with predict_generator, I get a much better confusion matrix and classification report:
```
[[947  53]
 [ 25 975]]

              precision    recall  f1-score   support

           0       0.97      0.95      0.96      1000
           1       0.95      0.97      0.96      1000

    accuracy                           0.96      2000
   macro avg       0.96      0.96      0.96      2000
weighted avg       0.96      0.96      0.96      2000
```
But if I set shuffle=True, the results are very disheartening:
```python
test_dataset = test_image_gen.flow_from_directory(test_path,
                                                  target_size=(125, 125),
                                                  batch_size=batch_size,
                                                  class_mode='binary',
                                                  shuffle=True)
```
```
[[495 505]
 [477 523]]

              precision    recall  f1-score   support

           0       0.51      0.49      0.50      1000
           1       0.51      0.52      0.52      1000

    accuracy                           0.51      2000
   macro avg       0.51      0.51      0.51      2000
weighted avg       0.51      0.51      0.51      2000
```
Solution 1:[1]
The problem with shuffle=True on the validation set is that the generator reshuffles the data, so the order in which predictions come out no longer matches the order of the ground-truth labels (e.g. `test_dataset.classes`). The predictions themselves may well be correct, but each one is compared against the label at the wrong index, which yields the chance-level (~50%) results you observed.
Always use shuffle=True on the training set and shuffle=False on the validation and test sets.
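The effect is easy to reproduce with plain NumPy. Below is a minimal sketch using synthetic labels (not the question's data): a "model" that is 96% accurate looks like a coin flip as soon as its predictions are scored against labels read back in a shuffled order.

```python
import numpy as np

rng = np.random.default_rng(0)

# 2000 ground-truth binary labels, and predictions that are 96% correct
# (80 deliberately flipped errors).
y_true = rng.integers(0, 2, size=2000)
y_pred = y_true.copy()
flip = rng.choice(2000, size=80, replace=False)
y_pred[flip] ^= 1

acc_aligned = (y_true == y_pred).mean()
print(acc_aligned)  # 0.96

# If the generator reshuffles, the labels you compare against are in a
# different order than the predictions, so accuracy collapses to chance.
y_true_shuffled = rng.permutation(y_true)
acc_misaligned = (y_true_shuffled == y_pred).mean()
print(acc_misaligned)  # roughly 0.5
```

This is exactly the mismatch in the question: the 96%-accurate model did not get worse, the label order did.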
Original answer: accuracy-reduced-when-shuffle-set-to-true-in-keras-fit-generator
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Kartik Sikka |
