LSTM: Confusion Between the timesteps and features in the input_shape

I am using an LSTM for credit card fraud detection. There are 21 features in the dataset. I do not know what the number 9 means in this case:

  X_train,X_test,y_train,y_test = train_test_split(X_s, y, test_size=0.3)
  print(X_train.shape)    
  print(X_test.shape)

This is the output:

  (398041, 9)
  (170589, 9)

  X_train = X_train.reshape((X_train.shape[0], 1, X_train.shape[1]))  
  X_test = X_test.reshape((X_test.shape[0], 1, X_test.shape[1]))

Also, what does a timestep of 1 mean in my case?

  from tensorflow import keras  # assumption: Keras as bundled with TensorFlow

  def model(input_shape):
      # input_shape is (num_timesteps, num_features), e.g. (1, 9) here
      model = keras.Sequential()
      model.add(keras.layers.LSTM(50, input_shape=input_shape, return_sequences=True, recurrent_dropout=0.2))
      model.add(keras.layers.LSTM(50))

      model.add(keras.layers.Dense(50, activation='sigmoid'))
      model.add(keras.layers.Dropout(0.3))

      model.add(keras.layers.Dense(1, activation='sigmoid'))

      return model

Thank you for the help in advance!

Solution 1:^[1]

There is a fundamental difference between the input structure of an MLP and an LSTM. In an MLP, the inputs are 2-dimensional: the first dimension is the number of samples and the second is the number of features. An LSTM, on the other hand, requires 3-dimensional inputs of shape (number of samples, timesteps, features). The number of timesteps is something like the time lag; it can be thought of as the number of times the LSTM cell is iterated for each sample. Suppose we have the following time series data set (inputs only) with two features and seven samples:

[X1 X2] = [3 42; 3 23; 23 32; 23 54; 32 23; 32 11; 3 17].

If we want to use a time lag of 2 in an MLP, the inputs are:

1- [3 42 3 23]
2- [3 23 23 32]
3- [23 32 23 54]
4- [23 54 32 23]
5- [32 23 32 11]
6- [32 11 3 17]

So, the final data set shape is 6*4. But it changes to the following form in the LSTM:

1- [3 42] [3 23]
2- [3 23] [23 32]
3- [23 32] [23 54]
4- [23 54] [32 23]
5- [32 23] [32 11]
6- [32 11] [3 17]

So, the final data set shape is 6*2*2, i.e. (samples, timesteps, features).
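The windowing above can be reproduced with a few lines of NumPy. This is only a minimal sketch: the helper name `make_windows` and the variable names are hypothetical and not part of the original answer. It builds the 3-dimensional LSTM input first and then flattens each window to get the MLP layout.

```python
import numpy as np

# The seven samples with two features [X1 X2] from the example above.
data = np.array([[3, 42], [3, 23], [23, 32], [23, 54],
                 [32, 23], [32, 11], [3, 17]])

def make_windows(series, lag):
    """Hypothetical helper: stack every run of `lag` consecutive rows into one sample."""
    n_windows = series.shape[0] - lag + 1
    return np.stack([series[i:i + lag] for i in range(n_windows)])

# LSTM layout: (samples, timesteps, features)
lstm_inputs = make_windows(data, lag=2)

# MLP layout: each window flattened to (samples, timesteps * features)
mlp_inputs = lstm_inputs.reshape(len(lstm_inputs), -1)

print(lstm_inputs.shape)       # (6, 2, 2)
print(mlp_inputs.shape)        # (6, 4)
print(mlp_inputs[0].tolist())  # [3, 42, 3, 23]
```

By the same logic, reshaping a 2-dimensional array to (samples, 1, features), as in the question, gives each sample a sequence length of one, so the LSTM sees a single step per sample.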

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	maakaan
