How to re-use sklearn.preprocessing's StandardScaler on a .tflite model in an Android application?
I've built a neural network model which I saved as a .tflite model for further integration into my Android application. I've successfully integrated it, but I just realized that I'm missing the input-scaling step, which was done in Python with sklearn's StandardScaler.
// Standardize features by removing the mean and scaling to unit variance.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
Is there a way to save the StandardScaler's parameters so they can be used to process the input data in my Android application?
float[] input = new float[]{0.1f, 0.2f, 3f, 4.4f, 6.1f, 1.3f};
float[][] output = new float[1][4];
// Need to standardize the input here
// before feeding it to the model
// Run decoding signature.
try (Interpreter interpreter = new Interpreter(loadModelFile())) {
    Map<String, Object> inputs = new HashMap<>();
    inputs.put("dense_6_input", input);
    Map<String, Object> outputs = new HashMap<>();
    outputs.put("dense_8", output);
    interpreter.runSignature(inputs, outputs, "serving_default");
} catch (IOException e) {
    e.printStackTrace();
}
I also saw that a sklearn.pipeline.Pipeline could be used to save the scaler within the model, but in this case I can't find an example or documentation of how to load and use it in java.
Solution:
If you want to use your model in a different environment but still need to standardize the input data, here is my solution:
As the documentation says, sklearn's StandardScaler uses Z-score normalization, z = (x - u) / s, where u is the mean of the training set and s is its standard deviation. This means that we have to know the mean and the std of the training set to be able to standardize input in a different environment.
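As a quick worked example of the formula (with made-up values u = 2 and s = 0.5, not taken from the dataset above):

```python
# Z-score normalization: z = (x - u) / s
u, s = 2.0, 0.5   # mean and standard deviation of the training set
x = 3.0           # raw feature value
z = (x - u) / s
print(z)          # 2.0 -> x is two standard deviations above the mean
```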
To get the mean and the std:
# Standardize the data, so we won't have unexpected network behavior
scaler = StandardScaler(with_mean=True, with_std=True)
X_train_scaled = scaler.fit_transform(X_train)
print('Scaler mean attribute:')
print(scaler.mean_)
print('Scaler std attribute:')
print(scaler.scale_)
// Output
// The mean and std for my features
Scaler mean attribute:
array([-0.75648769, 4.12972816, 0.7942958 , 0.25645808, -0.10128877,
-0.9810976 ])
Scaler std attribute:
array([ 3.32718737, 3.14302739, 11.21207204, 0.18413352, 0.18150619,
0.09896419])
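If you'd rather not hard-code these values in Java, one option is to dump `scaler.mean_` and `scaler.scale_` to a JSON file and bundle it as an Android asset. A minimal sketch (the toy `X_train` and the `scaler_params.json` filename are my own placeholders):

```python
import json

import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.array([[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]])  # toy data

scaler = StandardScaler()
scaler.fit(X_train)

# Serialize the fitted parameters so any runtime (e.g. Java) can read them.
params = {"mean": scaler.mean_.tolist(), "std": scaler.scale_.tolist()}
with open("scaler_params.json", "w") as f:
    json.dump(params, f)
```

On the Android side you would then parse the JSON once at startup (e.g. with `org.json.JSONObject`) instead of maintaining the constants by hand.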
Now you're free to use it anywhere as in the following example:
private static final double[] FEATURES_MEAN = {-0.75648769, 4.12972816, 0.7942958, 0.25645808, -0.10128877, -0.9810976};
private static final double[] FEATURES_STD = {3.32718737, 3.14302739, 11.21207204, 0.18413352, 0.18150619, 0.09896419};

// Test input
double[] input_test = new double[]{-3.862595, 1.480916, 3.381679, 0.349121, -0.159424, -0.910645};

// Standardize features by performing Z-score normalization:
// new_feature = (x - FEATURES_MEAN) / FEATURES_STD
float[] standardized_input = new float[input_test.length];
for (int i = 0; i < input_test.length; i++) {
    standardized_input[i] = (float) ((input_test[i] - FEATURES_MEAN[i]) / FEATURES_STD[i]);
}
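To sanity-check that the manual loop reproduces sklearn exactly, you can rebuild a "fitted" scaler in Python from the printed parameters and compare. Note that assigning `mean_`/`scale_` by hand works in current scikit-learn versions but is not a documented API, so treat this as a verification sketch:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Parameters printed by the fitted scaler above
mean = np.array([-0.75648769, 4.12972816, 0.7942958, 0.25645808, -0.10128877, -0.9810976])
std = np.array([3.32718737, 3.14302739, 11.21207204, 0.18413352, 0.18150619, 0.09896419])

# Rebuild a scaler from the exported parameters instead of refitting
scaler = StandardScaler()
scaler.mean_ = mean
scaler.scale_ = std
scaler.var_ = std ** 2
scaler.n_features_in_ = mean.shape[0]

x = np.array([[-3.862595, 1.480916, 3.381679, 0.349121, -0.159424, -0.910645]])
manual = (x - mean) / std   # same formula as the Java loop
sk = scaler.transform(x)    # sklearn's own transform
```

If `manual` and `sk` agree, the Java loop is a faithful port of the scaler.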
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
