Scaling or Normalizing Data gives worse results (Already checked Implementation)

I am trying to optimize my model with Optuna and was looking for the reason why its loss always stays around 0.5. I realized that my normalization makes the results worse than training without it. I checked my implementation in a separate script to make sure it is correct. I then also tried standardization, but that gave worse results as well.

I am training an LSTM on time series data. I am trying to build a classifier for time series from 3 different classes. Each class has sequences of 190 timestamps.

1. Training without normalization or standardization

(image)

2. With MinMaxScaler

(image)

3. With normalization

(image)

Implementation of normalize

So I checked my implementation with this simple script.

from sklearn.preprocessing import normalize
import pandas as pd

x = [[1, -1, 2], [2, 0, 0], [0, 1, -1]]
x = pd.DataFrame(x)

print(x)
# Output:
#    0  1  2
# 0  1 -1  2
# 1  2  0  0
# 2  0  1 -1

x_1 = normalize(x, axis=0, norm='max')

print(x_1)
# Output:
# [[ 0.5 -1.   1. ]
#  [ 1.   0.   0. ]
#  [ 0.   1.  -0.5]]
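
To make explicit what the check above verifies, here is a minimal sketch of the same operation written out in NumPy. It assumes that `normalize(..., axis=0, norm='max')` divides each column by its largest absolute value, which is how current scikit-learn versions behave as far as I can tell; the names `col_max` and `manual` are only illustrative.

```python
import numpy as np
from sklearn.preprocessing import normalize

x = np.array([[1, -1, 2], [2, 0, 0], [0, 1, -1]], dtype=float)

# Largest absolute value in each column (axis=0 means "per feature").
col_max = np.abs(x).max(axis=0)

# Dividing by it rescales each column into [-1, 1] without shifting it,
# so signs are preserved and zeros stay zero.
manual = x / col_max

# For this toy matrix the manual result matches scikit-learn's output.
assert np.allclose(manual, normalize(x, axis=0, norm="max"))
```

Because nothing is subtracted, a column whose values are all positive ends up in (0, 1] rather than being spread over the whole [-1, 1] range.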

Implementation of Scaler

from sklearn.preprocessing import MinMaxScaler
import pandas as pd

x = [[1, -1, 2], [2, 0, 0], [0, 1, -1]]
x = pd.DataFrame(x)

print(x)
# Output:
#    0  1  2
# 0  1 -1  2
# 1  2  0  0
# 2  0  1 -1

scaler = MinMaxScaler(feature_range=(-1,1))
x_1 = scaler.fit_transform(x)

print(x_1)
# Output:
# [[ 0.         -1.          1.        ]
#  [ 1.          0.         -0.33333333]
#  [-1.          1.         -1.        ]]
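
For comparison, the min-max transformation can be written out by hand. This is a minimal sketch of the documented formula (shift each column by its minimum, divide by its range, then map onto the target interval); the variable names `lo`, `hi`, `col_min`, `col_max` and `manual` are only illustrative.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

x = np.array([[1, -1, 2], [2, 0, 0], [0, 1, -1]], dtype=float)

lo, hi = -1.0, 1.0  # target feature_range

# Per-column minimum and maximum, learned from the data passed to fit().
col_min = x.min(axis=0)
col_max = x.max(axis=0)

# Map each column linearly so that its min lands on lo and its max on hi.
manual = (x - col_min) / (col_max - col_min) * (hi - lo) + lo

scaled = MinMaxScaler(feature_range=(lo, hi)).fit_transform(x)
assert np.allclose(manual, scaled)
```

Unlike the max normalization, this transformation also shifts the data: the 0 in the last column becomes -0.333, because that column's minimum is -1 and its maximum is 2.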

Real Implementation in my script

I have my data stored in one big DataFrame called mdf_import. Each row is a timestamp, and at the end there is a column with an index indicating which sequence the timestamp belongs to, plus a column with the label. Here I separate the sequences by their index and store each one in a tuple with its label.

for Index, group in mdf_import.groupby("Index"):
    # Columns selected by labellist[1] for this one sequence
    sequence_features = group[labellist[1]]
    print(sequence_features)

    # Scaling is applied to each sequence separately (only this group's rows are seen)
    #scaler = MinMaxScaler(feature_range=(-1,1))
    #sequence_features = scaler.fit_transform(sequence_features)
    sequence_features = normalize(sequence_features, axis=0, norm='max')

    # Encoded label of the sequence with this index
    label = labellist[0][labellist[0].Index == Index].iloc[0].enc_label
    sequences.append((sequence_features, label))
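
One property of the loop above that the small test scripts do not exercise is that the scaling statistics are computed per sequence rather than over the whole dataset. The following sketch uses made-up toy data (the column name `feature`, the amplitudes, and all variable names are hypothetical) to show what that does to two sequences that differ only in amplitude. It is an illustration of the behaviour, not a confirmed explanation for the worse loss.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler, normalize

# Hypothetical toy data: two sequences with the same shape,
# the second one with 10x the amplitude of the first.
t = np.linspace(0, 1, 5)
df = pd.DataFrame({
    "Index": [0] * 5 + [1] * 5,
    "feature": np.concatenate([t + 1.0, 10.0 * (t + 1.0)]),
})

# Per-sequence scaling, as in the loop above: each group is divided
# by its own maximum, so the amplitude difference disappears.
per_seq = [
    normalize(group[["feature"]], axis=0, norm="max")
    for _, group in df.groupby("Index")
]
same_per_seq = np.allclose(per_seq[0], per_seq[1])
print(same_per_seq)  # True

# Alternative: fit the scaler once on all rows, then transform each
# sequence with the same statistics. The amplitude difference survives.
scaler = MinMaxScaler(feature_range=(-1, 1)).fit(df[["feature"]])
global_seq = [
    scaler.transform(group[["feature"]])
    for _, group in df.groupby("Index")
]
same_global = np.allclose(global_seq[0], global_seq[1])
print(same_global)  # False
```

If the classes differ mainly in the absolute level or amplitude of their features, per-sequence scaling would remove exactly that information. In a real pipeline the shared scaler would normally be fit on the training split only.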





Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
