How to get accurate execution time of a .tflite model?
I have a pre-trained classification model in TF 2.x (.pb or .h5). I can run predictions with it and measure the execution time of each predict() call (the inference part). Now I have converted it to a .tflite model, and I can test it (in terms of accuracy) using this code:
import numpy as np
import tensorflow as tf

# Load the TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_path=quantized_tflite_model)
interpreter.allocate_tensors()

# Get input and output tensor details.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Test the model on the test set.
input_shape = input_details[0]['shape']
acc = 0
for i in range(len(X_Test)):
    input_data = X_Test[i].reshape(input_shape)
    # input_data = input_data.astype(np.int8)
    interpreter.set_tensor(input_details[0]['index'], input_data)
    interpreter.invoke()
    output_data = interpreter.get_tensor(output_details[0]['index'])
    if np.argmax(output_data) == np.argmax(Y_Test[i]):
        acc += 1
acc = acc / len(X_Test)
print(acc * 100)
But I don't know how to measure the execution time of this model (.tflite) on my system.
I get a misleading time when I start the measurement before interpreter.set_tensor(input_details[0]['index'], input_data) and stop it after output_data = interpreter.get_tensor(output_details[0]['index']). I believe most of that time is spent in interpreter.invoke(), which takes a long time.
How can I accurately measure the execution time of a .tflite model on my system?
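One common approach (a sketch, not from the original post) is to time only the interpreter.invoke() call with time.perf_counter(), discard a few warm-up runs (the first invocations are often slower due to lazy initialization and caching), and report the median over many runs. The helper below is generic and assumes you pass it any zero-argument callable, such as interpreter.invoke:

```python
import time
import statistics

def benchmark(fn, warmup=5, runs=50):
    """Time a zero-argument callable: discard warm-up runs,
    then return the median latency in milliseconds."""
    for _ in range(warmup):
        fn()                       # warm-up runs, not timed
    times_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        times_ms.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(times_ms)

# With a loaded interpreter (input already set via set_tensor),
# you would time only the inference step:
#   latency_ms = benchmark(interpreter.invoke)
```

Timing only invoke() separates inference from the set_tensor/get_tensor copies; the median is less sensitive to scheduler noise than the mean.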
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow