How to add an attention mechanism to a time-series prediction LSTM?

I am working on time-series prediction with a simple LSTM model and want to improve its performance, so I am wondering how to add an attention mechanism. Here is the code for my model:

import torch
import torch.nn as nn

class RNN_LSTM(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, num_classes,
                 sequence_length, drop_rate):
        super(RNN_LSTM, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        # The head consumes the hidden states of all time steps, flattened,
        # so it is tied to a fixed sequence_length
        self.fc = nn.Linear(hidden_size * sequence_length, num_classes)
        self.dropout = nn.Dropout(p=drop_rate)

    def forward(self, x):
        # x: (batch_size, seq_length, input_size)
        # Initial hidden and cell states, created on the same device as the input
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size, device=x.device)
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size, device=x.device)

        # Forward propagate the LSTM; out: (batch_size, seq_length, hidden_size)
        out, _ = self.lstm(x, (h0, c0))

        # Flatten the hidden states of all time steps into one vector per sample
        out = out.reshape(out.shape[0], -1)

        # Apply dropout and project the flattened sequence onto the outputs
        out = self.fc(self.dropout(out))
        return out
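
For context, here is my rough sketch of what I think adding attention could look like: score each LSTM hidden state with a small learned layer, softmax the scores over the time dimension, and feed the weighted sum of hidden states to the final layer. The class name AttentionLSTM, the single-linear-layer scorer, and the shapes in the usage example are all placeholders I made up, not from any library, and I am not sure this is the best design:

import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionLSTM(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, num_classes, drop_rate=0.0):
        super(AttentionLSTM, self).__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.attn_score = nn.Linear(hidden_size, 1)    # one scalar score per time step
        self.dropout = nn.Dropout(p=drop_rate)
        self.fc = nn.Linear(hidden_size, num_classes)  # no sequence_length dependency

    def forward(self, x):
        # x: (batch_size, seq_length, input_size); initial states default to zeros
        out, _ = self.lstm(x)                  # (batch, seq_len, hidden)
        scores = self.attn_score(out)          # (batch, seq_len, 1)
        weights = F.softmax(scores, dim=1)     # normalize over the time dimension
        context = (weights * out).sum(dim=1)   # (batch, hidden): weighted sum of states
        return self.fc(self.dropout(context))

# Example with made-up shapes: batch of 32, 144 time steps, 8 input features
model = AttentionLSTM(input_size=8, hidden_size=64, num_layers=2, num_classes=1)
y = model(torch.randn(32, 144, 8))  # y: (32, 1)

One side effect I noticed is that the final layer no longer depends on sequence_length, since the attention pooling collapses the time dimension. Is this the right direction?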

I would be very grateful for any useful advice.


