Dataset Preparation for LSTM (multiple variables)

I am struggling to understand the correct way to prepare a time series dataset for LSTM training. My main concern is how to train the network to "remember" the N previous steps. I have two possible approaches in mind, but I am not sure which one is correct. I have tried both (with a single variable only), and both seem to give plausible results.

1.) The dataset should be in a tensor format like this:

    X = [[var1_t_1, var2_t_1, var3_t_1],
         [var1_t_2, var2_t_2, var3_t_2],
         ...]
    X.shape = [N, 3]

    y = [[target_t_1],
         [target_t_2],
         ...]
    y.shape = [N, 1]

During training the LSTM receives N inputs, one per time step, and returns N predictions, which are used to compute the loss and update the weights. The network builds its own "memory" of previous time step values through its cell state. But how many previous steps can it remember, and is there any way to define this memory length? (If possible, please answer with a PyTorch example.)
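For illustration, approach 1 could be sketched in PyTorch roughly as follows. This is only a minimal sketch under assumptions that are not part of the question: the sizes (`N = 50`, `hidden_size = 16`), the random stand-in data, and the linear output layer `head` are all hypothetical.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

N, n_features, hidden_size = 50, 3, 16  # hypothetical sizes

# One long series: N time steps with 3 variables each (random stand-in data).
X = torch.randn(N, n_features)   # X.shape = [N, 3]
y = torch.randn(N, 1)            # y.shape = [N, 1]

lstm = nn.LSTM(input_size=n_features, hidden_size=hidden_size, batch_first=True)
head = nn.Linear(hidden_size, 1)  # hypothetical output layer

# With batch_first=True, nn.LSTM expects (batch, seq_len, features),
# so the whole series is passed as a single sequence of length N.
out, (h_n, c_n) = lstm(X.unsqueeze(0))  # out: (1, N, hidden_size)
pred = head(out).squeeze(0)             # one prediction per time step: (N, 1)

loss = nn.functional.mse_loss(pred, y)
loss.backward()  # gradients flow back through all N steps of this call
```

In this setup the hidden and cell states are carried from step to step inside the single forward call. `nn.LSTM` takes no argument that fixes how many steps are remembered; the length of the sequence passed in determines how far back gradients can propagate.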

2.) The dataset should already contain the previous time step values as features, so a third dimension is necessary, e.g.:

    X = [[[var1_t_1, var2_t_1, var3_t_1], ..., [var1_t_10, var2_t_10, var3_t_10]],
         [[var1_t_2, var2_t_2, var3_t_2], ..., [var1_t_11, var2_t_11, var3_t_11]],
         ...]
    X.shape = [N-10, 10, 3]

    y = [[target_t_11],
         [target_t_12],
         ...]
    y.shape = [N-10, 1]

In this way we define the number of previous steps the LSTM should try to remember. In the example above we "ask" the LSTM to use the 10 previous time steps to make each prediction.
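Approach 2 could be sketched along these lines. Again this is only an illustration: the sizes, the random stand-in data, and the `WindowLSTM` class are hypothetical names and values introduced for the example, not an established recipe.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

N, n_features, window = 50, 3, 10  # hypothetical sizes

series = torch.randn(N, n_features)  # stand-in for var1..var3
target = torch.randn(N, 1)           # stand-in for the target

# Window i covers steps i .. i+window-1 (0-indexed);
# its label is the target at step i+window.
X = torch.stack([series[i:i + window] for i in range(N - window)])  # (N-10, 10, 3)
y = target[window:]                                                  # (N-10, 1)


class WindowLSTM(nn.Module):
    """Hypothetical model: LSTM over one window, prediction from its last step."""

    def __init__(self, n_features, hidden_size=16):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):
        out, _ = self.lstm(x)          # (batch, window, hidden_size)
        return self.head(out[:, -1])   # use only the last time step: (batch, 1)


model = WindowLSTM(n_features)
pred = model(X)                        # (N-10, 1)
loss = nn.functional.mse_loss(pred, y)
loss.backward()
```

Here every window is an independent sample, so the state is reset at the start of each window and the model sees exactly `window` past steps per prediction.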

Any help clarifying the concept is greatly appreciated. PyTorch code would be very welcome as well.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
