Loss exploding while training CNN despite small learning rate

I am working with synthetically generated data. Each sample has the shape 4x1745 and carries 2 labels, each of which can take one of 120 classes. The total number of possible class combinations comes out to 7140.
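The figure 7140 is consistent with counting unordered pairs of two distinct classes out of 120. That reading is an assumption, since the post does not say how the number was derived. A quick check:

```python
import math

n_classes = 120

# Assumption: a "combination" is an unordered pair of two distinct classes.
n_pairs = math.comb(n_classes, 2)
print(n_pairs)  # 7140

# For contrast: ordered pairs with repetition allowed.
n_ordered = n_classes ** 2
print(n_ordered)  # 14400
```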

I was able to train decision tree models on this data, reaching a train accuracy of 88% and a test accuracy of 20%.

I have built a CNN model with the following layers:

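from tensorflow import keras
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = keras.Sequential()
# Three convolution + pooling stages with increasing filter counts
model.add(Conv2D(16, kernel_size=(3, 3), activation='elu'))
model.add(MaxPooling2D())
model.add(Conv2D(32, kernel_size=(3, 3), activation='elu'))
model.add(MaxPooling2D())
model.add(Conv2D(64, kernel_size=(3, 3), activation='elu'))
model.add(MaxPooling2D())
# Classifier head: flatten, one hidden layer, softmax over 120 classes
model.add(Flatten())
model.add(Dense(128, activation='elu'))
model.add(Dense(120, activation='softmax'))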

I have compiled the model with the Adam optimizer, a learning rate of 0.0001, and categorical cross-entropy as the loss function.
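A minimal sketch of what that compile step might look like, assuming the `tensorflow.keras` API. The stand-in model and the `accuracy` metric are assumptions for illustration and are not taken from the post:

```python
from tensorflow import keras

# Stand-in model (hypothetical): a single softmax layer with 120 outputs,
# used only so that this snippet runs on its own.
model = keras.Sequential([keras.layers.Dense(120, activation="softmax")])

# Adam with the learning rate stated in the post.
optimizer = keras.optimizers.Adam(learning_rate=1e-4)

# categorical_crossentropy expects one-hot encoded targets, here of width 120.
# The accuracy metric is an assumption; the post does not mention metrics.
model.compile(
    optimizer=optimizer,
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)
```

With this loss, the targets passed to `fit` would need to be one-hot vectors whose width matches the 120-unit softmax output.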

The problem I am facing is that the loss eventually explodes and keeps increasing exponentially with each epoch.
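As a reference point for judging whether a loss value is exploding: a softmax over 120 classes that assigns equal probability to every class has a categorical cross-entropy of ln(120), roughly 4.79. The sketch below checks this in pure Python; the helper name `categorical_crossentropy` and its `eps` guard are illustrative and are not taken from the post or from Keras:

```python
import math


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy for a single sample: -sum(t * log(p))."""
    # eps guards against log(0); the value is an illustrative choice.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


n_classes = 120

# One-hot target for class 0.
y_true = [0.0] * n_classes
y_true[0] = 1.0

# A prediction that spreads probability evenly over all classes.
uniform = [1.0 / n_classes] * n_classes

baseline = categorical_crossentropy(y_true, uniform)
print(round(baseline, 4))  # 4.7875
```

A training loss that starts near this value and then climbs far above it would suggest the predictions are becoming confidently wrong rather than merely uninformed.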

I have tried different learning rates, but they only delay the point at which the loss explodes.
I changed the number of layers in the model, which didn't stop the loss from exploding.
I even reshaped the samples into 119x60, thinking that the CNN might be unable to pick up any patterns when the samples are so long, but it didn't help.
I have also tried changing the activation functions and the batch sizes.
Finally, I tried using an ANN as well, which led to the same problem.

Any help is highly appreciated.



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
