Scaling or Normalizing Data gives worse results (Already checked Implementation)
I am trying to optimize my model with Optuna and was looking for the reason why its loss always stays around 0.5. I realized that my normalization makes the results worse than training without it. I checked my implementation in a separate script to be sure I implemented it correctly. I then also tried standardization, but that gave me worse results as well.
I am training an LSTM on time series data to build a classifier for 3 different classes. Each class has sequences of 190 timestamps.
1. Training without normalization or standardization
2. With MinMaxScaler
3. With normalization
Implementation of normalize
I checked my implementation with this simple script.
from sklearn.preprocessing import MinMaxScaler, normalize
import pandas as pd
x = [[1, -1, 2], [2, 0, 0], [0, 1, -1]]
x = pd.DataFrame(x)
print(x)
>>>
   0  1  2
0  1 -1  2
1  2  0  0
2  0  1 -1
x_1 = normalize(x, axis=0, norm='max')
print(x_1)
>>>
[[ 0.5 -1.   1. ]
 [ 1.   0.   0. ]
 [ 0.   1.  -0.5]]
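As a cross-check of the output above, here is a minimal sketch that reproduces the same numbers by hand with NumPy only. It assumes that `normalize(x, axis=0, norm='max')` divides each column by that column's largest absolute value; the variable names `col_max_abs` and `x_max_normalized` are made up for this illustration.

```python
import numpy as np

x = np.array([[1, -1, 2], [2, 0, 0], [0, 1, -1]], dtype=float)

# Largest absolute value in each column (axis=0 means "per column").
# Note: a column containing only zeros would need special handling here.
col_max_abs = np.abs(x).max(axis=0)

# Divide every column by its own max-abs value.
x_max_normalized = x / col_max_abs
print(x_max_normalized)
```

The result matches the printed `x_1`, so every column ends up in the range [-1, 1] without being shifted.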
Implementation of Scaler
from sklearn.preprocessing import MinMaxScaler, normalize
import pandas as pd
x = [[1, -1, 2], [2, 0, 0], [0, 1, -1]]
x = pd.DataFrame(x)
print(x)
>>>
   0  1  2
0  1 -1  2
1  2  0  0
2  0  1 -1
scaler = MinMaxScaler(feature_range=(-1,1))
x_1 = scaler.fit_transform(x)
print(x_1)
>>>
[[ 0.         -1.          1.        ]
 [ 1.          0.         -0.33333333]
 [-1.          1.         -1.        ]]
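The scaler output above can be checked the same way. This is a small sketch, assuming the usual min-max formula: map each column to [0, 1] using its own minimum and maximum, then stretch to the target range. The names `lo`, `hi`, `x_unit` and `x_minmax` are illustrative only.

```python
import numpy as np

x = np.array([[1, -1, 2], [2, 0, 0], [0, 1, -1]], dtype=float)
lo, hi = -1.0, 1.0  # target range, same as feature_range=(-1, 1)

# Per-column minimum and maximum.
col_min = x.min(axis=0)
col_max = x.max(axis=0)

# Step 1: map each column to [0, 1]. Step 2: stretch and shift to [lo, hi].
x_unit = (x - col_min) / (col_max - col_min)
x_minmax = x_unit * (hi - lo) + lo
print(x_minmax)
```

Unlike the max-abs version, this shifts each column, so a zero in the input does not generally stay zero (see the -0.333 in the third column).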
Real Implementation in my script
I have my data stored in one big DataFrame called mdf_import. Each row is a timestamp, and at the end there is a column with an index identifying which sequence the timestamp belongs to and a column with a label. Here I separate the sequences based on their index and store each one in a tuple together with its label.
for Index, group in mdf_import.groupby("Index"):
    sequence_features = group[labellist[1]]
    print(sequence_features)
    #scaler = MinMaxScaler(feature_range=(-1,1))
    #sequence_features = scaler.fit_transform(sequence_features)
    sequence_features = normalize(sequence_features, axis=0, norm='max')
    label = labellist[0][labellist[0].Index == Index].iloc[0].enc_label
    sequences.append((sequence_features, label))
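One property of the loop above is that each sequence is scaled by its own maximum, because `normalize` is called once per group. The sketch below only illustrates what that does to amplitude information; it is not a diagnosis of the actual data. The two sine sequences, the helper `max_abs_scale` and the idea of a single shared scale are all assumptions made up for this example.

```python
import numpy as np

# Two hypothetical single-feature sequences with the same shape but
# different amplitude (stand-ins for sequences from two classes).
t = np.linspace(0, 2 * np.pi, 190)
seq_small = 1.0 * np.sin(t).reshape(-1, 1)
seq_large = 10.0 * np.sin(t).reshape(-1, 1)

def max_abs_scale(a, scale=None):
    """Divide each column by its max-abs value (or by a given scale)."""
    if scale is None:
        scale = np.abs(a).max(axis=0)
    return a / scale

# Per-sequence scaling, as in the loop: each sequence uses its own maximum.
per_seq_small = max_abs_scale(seq_small)
per_seq_large = max_abs_scale(seq_large)

# Shared scaling: one scale computed from all sequences together.
global_scale = np.abs(np.vstack([seq_small, seq_large])).max(axis=0)
glob_small = max_abs_scale(seq_small, global_scale)
glob_large = max_abs_scale(seq_large, global_scale)

# Are the two sequences still distinguishable after scaling?
print(np.allclose(per_seq_small, per_seq_large))
print(np.allclose(glob_small, glob_large))
```

With per-sequence scaling the two sequences become numerically identical, while a shared scale keeps their amplitude difference. If the classes differ partly in magnitude, that may be worth checking; in that case the shared scale would typically be computed on the training sequences only.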
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow