Error while using categorical_crossentropy

I am learning deep learning with TensorFlow. I wrote a simple NLP model that predicts the next word of a given sentence:

model = tf.keras.Sequential()
model.add(Embedding(num,64,input_length = max_len-1))   # we subtract 1 because we cropped the last word from X in our data
model.add(Bidirectional(LSTM(32)))
model.add(Dense(num,activation = 'softmax'))


model.compile(optimizer = 'adam',loss = 'categorical_crossentropy',metrics = ['accuracy'])

history = model.fit(X,Y,epochs = 500)

However, using categorical_crossentropy gives me the following error:

ValueError: You are passing a target array of shape (453, 1) while using as loss `categorical_crossentropy`. `categorical_crossentropy` expects targets to be binary matrices (1s and 0s) of shape (samples, classes). If your targets are integer classes, you can convert them to the expected format via:
```
from keras.utils import to_categorical
y_binary = to_categorical(y_int)
```

Alternatively, you can use the loss function `sparse_categorical_crossentropy` instead, which does expect integer targets.

Can someone explain what this means and why I can't use the categorical cross-entropy loss function? Thank you so much! Any help would be appreciated!



Solution 1:[1]

Categorical cross-entropy is used for multi-class classification problems. When you use "softmax" as the activation, the output layer has one node per class. For each sample, the node corresponding to the sample's class should be close to one and the remaining nodes should be close to zero. The true class labels in Y therefore need to be one-hot encoded vectors, so that Y has shape (samples, classes). Your Y has shape (453, 1), which means it holds one integer label per sample instead, and that is what the error is complaining about.
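To make the shape requirement concrete, here is a minimal pure-Python sketch of what one-hot encoding does to integer labels. The helper name `one_hot` and the toy labels are made up for illustration; in practice `to_categorical` does this job.

```python
def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot rows (hypothetical helper, for illustration only)."""
    return [[1.0 if j == label else 0.0 for j in range(num_classes)]
            for label in labels]

# Toy data: 4 samples, 3 classes (assumed values, not from the question)
y_int = [0, 2, 1, 2]
y_one_hot = one_hot(y_int, num_classes=3)

for row in y_one_hot:
    print(row)
# Each row has exactly one 1.0, at the index of the sample's class,
# so the target shape is (samples, classes) = (4, 3)
```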

Suppose your class labels in Y are integers like 0, 1, 2, ... Then try the code below.

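import tensorflow as tf
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense
from tensorflow.keras.utils import to_categorical

model = tf.keras.Sequential()
model.add(Embedding(num, 64, input_length=max_len - 1))  # subtract 1 because the last word was cropped from X in the data
model.add(Bidirectional(LSTM(32)))
model.add(Dense(num, activation='softmax'))  # one output node per class

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Convert the integer labels in Y into one-hot rows.
# num_classes=num keeps the width equal to the Dense layer's output size
# even if the largest class index never occurs in Y.
Y_one_hot = to_categorical(Y, num_classes=num)
history = model.fit(X, Y_one_hot, epochs=500)  # train on Y_one_hot instead of Y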
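
As the error message notes, the other route is to leave Y as integers and switch the loss to `sparse_categorical_crossentropy`. The pure-Python sketch below, with made-up softmax outputs and hypothetical helper names, suggests why both routes give the same loss value: with a one-hot target, every term of the sum except the one for the true class is multiplied by zero.

```python
import math

def categorical_ce(one_hot_target, probs):
    # Full sum over classes: -sum_k t_k * log(p_k)
    return -sum(t * math.log(p) for t, p in zip(one_hot_target, probs))

def sparse_categorical_ce(int_target, probs):
    # Same quantity, read off directly at the true class index
    return -math.log(probs[int_target])

probs = [0.1, 0.7, 0.2]          # assumed softmax output for one sample
label = 1                        # integer class label
one_hot_label = [0.0, 1.0, 0.0]  # the same label, one-hot encoded

loss_dense = categorical_ce(one_hot_label, probs)
loss_sparse = sparse_categorical_ce(label, probs)
print(loss_dense, loss_sparse)   # the two values agree
```

In the question's code this would amount to changing only the `loss` argument of `model.compile` and passing the original Y to `model.fit`. It also avoids building a (453, num) one-hot matrix, which matters when the vocabulary is large.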

Solution 2:[2]

Depending on the installed Keras version, the import of to_categorical in the answer above (by Roohollah Etemadi) may need to be written as follows:

from keras.utils.np_utils import to_categorical

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Su Silva