PyTorch LSTM model for next word prediction: accuracy stops increasing at a specific value while the (high) loss continues to go down

I'm fairly new to this, so I apologize for any wrong terminology.

I have implemented an LSTM model for next word prediction using PyTorch. The input data is one-hot encoded word sequences: X is an encoded sequence of the previous words and Y is the encoded next word. There are 56635 sequences and 6258 unique words, and the sequence length is set to 2, so each input is a 2x6258 matrix and each output is a 1x6258 vector.
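To make the shapes concrete, here is a minimal pure-Python sketch of the encoding described above, run on a toy corpus. The corpus, the helper names `one_hot` and `make_examples`, and the use of plain lists instead of tensors are assumptions for illustration only; the real data has 6258 unique words rather than 5.

```python
def one_hot(index, size):
    """Return a list of length `size` with a 1 at `index` and 0 elsewhere."""
    vec = [0] * size
    vec[index] = 1
    return vec


def make_examples(tokens, seq_len):
    """Slide a window over `tokens`: each run of `seq_len` words predicts the next word."""
    vocab = sorted(set(tokens))
    word_to_idx = {w: i for i, w in enumerate(vocab)}
    xs, ys = [], []
    for i in range(len(tokens) - seq_len):
        window = tokens[i:i + seq_len]
        target = tokens[i + seq_len]
        # X: seq_len one-hot rows, Y: a single one-hot row.
        xs.append([one_hot(word_to_idx[w], len(vocab)) for w in window])
        ys.append(one_hot(word_to_idx[target], len(vocab)))
    return xs, ys, vocab


tokens = "the cat sat on the mat".split()  # hypothetical toy corpus
xs, ys, vocab = make_examples(tokens, seq_len=2)

# Number of sequences, rows per input, columns per input, length of each target.
print(len(xs), len(xs[0]), len(xs[0][0]), len(ys[0]))
```

With the real data the same four numbers would be 56635, 2, 6258 and 6258.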

Over 10 epochs of training, accuracy increases and loss decreases during the first few epochs (the loss is admittedly high, which maybe someone can help me out with). After that, accuracy stops improving at 15.2% (the number is the same on separate runs), but loss continues to decrease slowly. What's going on?
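For context on how the two metrics can move independently: accuracy only looks at which word gets the highest score, while a cross-entropy loss looks at how much probability the correct word receives. The sketch below assumes the loss is softmax cross-entropy (the question does not say which loss is used) and uses made-up logits over a 4-word vocabulary, so it illustrates the mechanism rather than diagnosing this particular model.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability assigned to the target index."""
    return -math.log(softmax(logits)[target])


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical logits for one example; the true next word is index 2.
target = 2
logits_early = [2.0, 0.0, 0.0, 0.0]
logits_later = [2.0, 0.0, 1.0, 0.0]  # the correct word gained score, but not enough to win

for name, logits in [("early", logits_early), ("later", logits_later)]:
    print(name, "predicted index:", argmax(logits),
          "loss:", round(cross_entropy(logits, target), 4))
```

In both cases the predicted index is 0, so the example counts as wrong both times and accuracy is unchanged, yet the loss is lower for the later logits.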

Here is the model I'm using, where hidden_size = 128 and num_layers = 2:

import torch
from torch import nn


class LSTM_NN(nn.Module):
    def __init__(self, vocab_size, seq_len, hidden_size, num_layers):
        super().__init__()
        # Store hyperparameters.
        self.vocab_size = vocab_size
        self.seq_len = seq_len
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        # LSTM over one-hot inputs of shape (batch, seq_len, vocab_size).
        self.LSTM = nn.LSTM(self.vocab_size, self.hidden_size, self.num_layers, batch_first=True)
        # Layers applied to the LSTM output at the last time step.
        # Note: there is no activation between the two linear layers.
        self.post_lstm_stack = nn.Sequential(
            nn.Linear(self.hidden_size, self.hidden_size),
            nn.Linear(self.hidden_size, vocab_size)
        )

    def forward(self, x):
        # Zero initial hidden and cell states, created on the same device as the input.
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size, device=x.device)
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size, device=x.device)
        out, _ = self.LSTM(x, (h0, c0))
        # Keep only the output at the last time step: (batch, hidden_size).
        out = out[:, -1, :]
        logits = self.post_lstm_stack(out)
        return logits

And here is the output I'm getting (only the test results are included):

Epoch 1
-------------------------------
Test Error: 
 Accuracy: 0.2%, Avg. loss: 8.722713 

Epoch 2
-------------------------------
Test Error: 
 Accuracy: 3.3%, Avg. loss: 8.709302 

Epoch 3
-------------------------------
Test Error: 
 Accuracy: 6.5%, Avg. loss: 8.695486 

Epoch 4
-------------------------------
Test Error: 
 Accuracy: 15.2%, Avg. loss: 8.681011 

Epoch 5
-------------------------------
Test Error: 
 Accuracy: 15.2%, Avg. loss: 8.665595 

Epoch 6
-------------------------------
Test Error: 
 Accuracy: 15.2%, Avg. loss: 8.648913 

Epoch 7
-------------------------------
Test Error: 
 Accuracy: 15.2%, Avg. loss: 8.630579 

Epoch 8
-------------------------------
Test Error: 
 Accuracy: 15.2%, Avg. loss: 8.610118 

Epoch 9
-------------------------------
Test Error: 
 Accuracy: 15.2%, Avg. loss: 8.586936 

Epoch 10
-------------------------------
Test Error: 
 Accuracy: 15.2%, Avg. loss: 8.560268
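As a reference point for reading these numbers: if the reported loss is a cross-entropy measured in nats (an assumption, since the question does not state the loss function), then a model that assigns equal probability to each of the 6258 words would score ln(6258), roughly 8.74. The short calculation below compares that baseline with two of the logged test losses.

```python
import math

vocab_size = 6258  # number of unique words, from the question

# Cross-entropy (in nats) of a uniform guess over the vocabulary.
# Assumes the logged loss is a softmax cross-entropy; the question does not confirm this.
uniform_loss = math.log(vocab_size)

# Test losses copied from the log above (epochs 1 and 10).
logged = {1: 8.722713, 10: 8.560268}

print(f"uniform baseline: {uniform_loss:.4f}")
for epoch, loss in logged.items():
    print(f"epoch {epoch}: loss {loss:.4f}, {uniform_loss - loss:.4f} below the baseline")
```

Under that assumption, the losses sit only slightly below the uniform baseline for all 10 epochs. An accuracy that is identical across epochs and runs would also be consistent with the model predicting the same frequent word for every input, which could be checked by counting the distinct predicted indices on the test set.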

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
