Python - neural network training loss and validation loss values

I have a question about my training loss and validation loss for a neural network in Python using PyTorch. I am using BERT to classify labels for some given text.

I have about 14k text records with 20 unique labels - where some labels are more frequent than others.

I use about 25% as my validation set and use stratification when performing train_test_split.

My learning rate is 1e-6

attention_probs_dropout_prob=0.2

hidden_dropout_prob=0.2

There is no data leakage, as I did not impute any values.
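For reference, this is roughly what my setup looks like (a simplified sketch; `load_my_data()` and `"bert-base-uncased"` are stand-ins for my actual data loading and checkpoint):

```python
import torch
from sklearn.model_selection import train_test_split
from transformers import BertConfig, BertForSequenceClassification

# load_my_data() is a placeholder for my actual loading code:
# ~14k text records, 20 unique labels with imbalanced frequencies
texts, labels = load_my_data()

# 25% validation set; stratify keeps label frequencies similar across splits
train_texts, val_texts, train_labels, val_labels = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=42
)

# BERT classifier with the dropout values mentioned above
config = BertConfig.from_pretrained(
    "bert-base-uncased",
    num_labels=20,
    attention_probs_dropout_prob=0.2,
    hidden_dropout_prob=0.2,
)
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", config=config
)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-6)
```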

While training my model I noticed a few things:

  1. Training loss remains higher than validation loss

  2. With each epoch both losses go down, but the training loss never drops below the validation loss, even though they stay close

  3. Example:

| | epoch 8 | epoch 50 | epoch 100 | epoch 200 |
| --- | --- | --- | --- | --- |
| training loss | 2.9 | 1.18 | 0.98 | 0.75 |
| validation loss | 2.4 | 1.0 | 0.67 | 0.45 |
| F1 score (weighted) | 0.55 | 0.75 | 0.86 | 0.90 |
  4. As seen above, the training loss decreases at first but then slows down, while the validation loss keeps decreasing in bigger increments (the sketch below shows how I compute both losses)
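For completeness, this is roughly how I record both losses each epoch (a simplified sketch; `train_loader` and `val_loader` are placeholders for my actual DataLoaders):

```python
import torch

def run_epoch(model, loader, optimizer=None, device="cuda"):
    """Return the average loss over one pass; trains only if an optimizer is given."""
    training = optimizer is not None
    if training:
        model.train()  # dropout active
    else:
        model.eval()   # dropout disabled
    total_loss, n_examples = 0.0, 0
    with torch.set_grad_enabled(training):
        for batch in loader:
            batch = {k: v.to(device) for k, v in batch.items()}
            outputs = model(**batch)  # batch includes "labels", so outputs.loss is set
            if training:
                optimizer.zero_grad()
                outputs.loss.backward()
                optimizer.step()
            batch_size = batch["labels"].size(0)
            total_loss += outputs.loss.item() * batch_size
            n_examples += batch_size
    return total_loss / n_examples

# Each epoch:
# train_loss = run_epoch(model, train_loader, optimizer)
# val_loss   = run_epoch(model, val_loader)
```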

Can someone explain to me what is going on with how this model is learning? My understanding is that it is not performing well, based on the training loss and validation loss values. Usually I would expect both values to be lower, with the training loss below the validation loss.

Any input is appreciated.

Thank you


