TFLite inference on AArch64 is very slow
I converted the .pb model to .tflite using this script:

```python
import tensorflow as tf

# float32 conversion using the TF 1.x frozen-graph converter
def tflite_convert_float32(input_array, output_array, pb_path, tflite_path):
    converter = tf.lite.TFLiteConverter.from_frozen_graph(
        pb_path,
        input_arrays=input_array,
        output_arrays=output_array,
    )
    tfmodel = converter.convert()
    with open(tflite_path, "wb") as f:
        f.write(tfmodel)
```
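The function is called with the frozen graph's input and output tensor names, roughly like this (the tensor names and file paths below are placeholders, not the real model's):

```python
tflite_convert_float32(
    input_array=["input"],              # placeholder input tensor name
    output_array=["output"],            # placeholder output tensor name
    pb_path="frozen_graph.pb",          # placeholder path to the frozen .pb
    tflite_path="model_float32.tflite",
)
```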
Then I ran an inference script on the converted .tflite model, but it performs worse than the original TensorFlow .pb inference on the same platform (a rough sketch of the timing loop is shown below).

- .pb inference, averaged over 100 frames: 0.15 fps
- .tflite inference, averaged over 100 frames: 0.13 fps

I had heard that TFLite performs very well on AArch64, so why is it slower than the .pb inference? Do I have to add something more when converting the .pb to .tflite? Am I missing something?
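For context, the TFLite timing loop is roughly the following. This is a minimal sketch assuming the standard `tf.lite.Interpreter` Python API; the random dummy tensor and hard-coded frame count stand in for my real frame pipeline:

```python
import time

import numpy as np
import tensorflow as tf

def benchmark_tflite(tflite_path, num_frames=100):
    # Load the converted model and allocate input/output buffers.
    interpreter = tf.lite.Interpreter(model_path=tflite_path)
    interpreter.allocate_tensors()

    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()

    # Random dummy frame with the model's expected shape and dtype.
    dummy = np.random.random_sample(tuple(input_details[0]["shape"])).astype(
        input_details[0]["dtype"])

    start = time.time()
    for _ in range(num_frames):
        interpreter.set_tensor(input_details[0]["index"], dummy)
        interpreter.invoke()
        _ = interpreter.get_tensor(output_details[0]["index"])
    elapsed = time.time() - start

    print("average fps:", num_frames / elapsed)
```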
lscpu output:

```
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: ARM
Model: 4
Model name: Cortex-A53
Stepping: r0p4
BogoMIPS: 16.00
L1d cache: unknown size
L1i cache: unknown size
L2 cache: unknown size
NUMA node0 CPU(s): 0-3
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
```