Softmax not giving probability of each class

I'm just doing a simple DNN on the Fashion MNIST dataset, and I have my last layer set to 10 units with softmax activation. What I'm expecting to see is a probability for each possible class, but all I'm getting is a 1 for a single class and 0s everywhere else, as if it's a binary classifier. Not sure what I'm doing wrong!

import tensorflow as tf
from tensorflow import keras
import numpy as np

#Loading data and setting the training, validation and test samples
fashion_mnist = keras.datasets.fashion_mnist
(X_train_full, y_train_full),(X_test,y_test) = fashion_mnist.load_data()
X_valid, X_train = X_train_full[:5000]/255. , X_train_full[5000:]/255.
y_valid, y_train = y_train_full[:5000], y_train_full[5000:]


#Creating the model
model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28,28]),
    keras.layers.Dense(300, activation = 'relu'),
    keras.layers.Dense(100, activation = 'relu'),
    keras.layers.Dense(10, activation = 'softmax'),
])

model.compile(loss="sparse_categorical_crossentropy", optimizer="sgd", metrics=["accuracy"])
history = model.fit(X_train, y_train, epochs=30, validation_data=(X_valid, y_valid))

# Predicting values
X_new = X_test[:3]
y_proba = model.predict(X_new)
y_proba.round(2)

This is where I'm expecting to see a few classes with probabilities (decimal values), but I'm getting all 1s... I looked through all 1000 test samples, but for simplicity I'm only displaying the first 3:

array([[0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 1.]], dtype=float32)

What am I not getting, or doing wrong?



Solution 1 [1]

Try checking the prediction with X_valid, since you trained your model after splitting the dataset into (X_train, y_train) and (X_valid, y_valid). Note that X_valid was scaled into the [0, 1] range together with X_train, while X_test in your snippet was never divided by 255; feeding the unscaled test images can saturate the softmax into hard 0/1 outputs:

# Predicting values
X_new = X_valid[:3]
y_proba = model.predict(X_new)
y_proba

Output:

array([[8.7486797e-15, 4.1651571e-14, 1.4356388e-15, 2.7562514e-19,
        6.6444322e-15, 2.3569682e-10, 8.2472345e-15, 3.0666047e-03,
        4.6038006e-14, 9.9693334e-01],
       [9.9992633e-01, 1.2232946e-16, 2.8763867e-05, 6.6998540e-10,
        3.3127880e-09, 4.4620761e-18, 4.4978275e-05, 7.2699694e-15,
        9.3659594e-12, 3.5588103e-24],
       [9.8151851e-01, 1.6080153e-04, 7.4726825e-07, 1.8097689e-02,
        1.4498403e-08, 1.7963254e-10, 2.2226702e-04, 2.8377571e-15,
        4.3501883e-13, 4.1252978e-12]], dtype=float32)

After rounding the prediction:

y_proba.round(2)

Output:

array([[0.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  , 1.  ],
       [1.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  ],
       [0.98, 0.  , 0.  , 0.02, 0.  , 0.  , 0.  , 0.  , 0.  , 0.  ]],
      dtype=float32)
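Rounding to two decimals hides the small probabilities: the 9.97e-01 above becomes 1.0 and anything below 0.005 becomes 0.0. If you prefer to inspect the raw softmax outputs in plain decimal notation rather than scientific notation, a minimal sketch using NumPy's standard print options:

import numpy as np

# Show fixed-point notation with 6 decimals instead of scientific
# notation, so small probabilities stay visible
np.set_printoptions(suppress=True, precision=6)

print(y_proba)              # raw softmax outputs
print(y_proba.sum(axis=1))  # each row sums to (approximately) 1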

Check the prediction for the first 3 values of X_valid:

for y in y_proba:
  print(np.argmax(y))

Output:

9
0
0
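Equivalently, np.argmax can be applied to the whole batch at once via its axis argument:

# Predicted class index for every row of y_proba in one call
y_pred = np.argmax(y_proba, axis=1)
print(y_pred)   # array([9, 0, 0])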

If you want to compare these values with y_valid (the actual labels):

y_valid[:3]

Output:

array([9, 0, 0], dtype=uint8)
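They match. For a quick sanity check over more than three images, one possible sketch (the slice of 100 images is arbitrary) compares the predicted class indices with the labels directly, or lets Keras report loss and accuracy on the whole validation set with model.evaluate:

# Fraction of correct predictions on the first 100 validation images
y_pred_100 = np.argmax(model.predict(X_valid[:100]), axis=1)
print((y_pred_100 == y_valid[:100]).mean())

# Loss and accuracy over the full validation set, as reported by Keras
model.evaluate(X_valid, y_valid)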

You can refer to this link to learn more about plotting predicted images together with their prediction percentages.
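As a rough illustration of that kind of plot, here is a minimal matplotlib sketch (assuming the standard Fashion MNIST class order for class_names) that shows each image with its predicted class and confidence:

import matplotlib.pyplot as plt

# Standard Fashion MNIST label names, in label-index order
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

plt.figure(figsize=(9, 3))
for i, probs in enumerate(y_proba):
    pred = np.argmax(probs)
    plt.subplot(1, 3, i + 1)
    plt.imshow(X_valid[i], cmap="binary")
    plt.axis("off")
    plt.title(f"{class_names[pred]} ({probs[pred]:.0%})")
plt.show()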

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source

[1] Solution 1: TFer2