Quantized model gives negative accuracy after conversion from PyTorch to ONNX
I'm trying to train a quantized model in PyTorch and convert it to ONNX. I use quantization-aware training with the help of the pytorch_quantization package. I used the code below to convert my model to ONNX:
import warnings

from pytorch_quantization import nn as quant_nn
from pytorch_quantization import calib
from pytorch_quantization.tensor_quant import QuantDescriptor
from pytorch_quantization import quant_modules
import onnxruntime
import torch
import torch.utils.data
from torch import nn
import torchvision

def export_onnx(model, onnx_filename, batch_onnx, per_channel_quantization):
    model.eval()
    # We have to shift to pytorch's fake quant ops before exporting the model to ONNX
    quant_nn.TensorQuantizer.use_fb_fake_quant = True

    if per_channel_quantization:
        opset_version = 13
    else:
        opset_version = 12

    # Export ONNX for multiple batch sizes
    print("Creating ONNX file: " + onnx_filename)
    dummy_input = torch.randn(batch_onnx, 3, 224, 224, device='cuda')  # TODO: switch input dims by model
    input_names = ['input']
    output_names = ['Linear[fc]']  # ResNet34
    dynamic_axes = {'input': {0: 'batch_size'}}

    try:
        torch.onnx.export(model, dummy_input, onnx_filename, input_names=input_names,
                          export_params=True, output_names=output_names, opset_version=opset_version,
                          verbose=True, enable_onnx_checker=False, do_constant_folding=True)
    except ValueError:
        warnings.warn(UserWarning("Per-channel quantization is not yet supported in Pytorch/ONNX RT (requires ONNX opset 13)"))
        print("Failed to export to ONNX")
        return False
    return True
After conversion, I get the following warnings:
warnings.warn("'enable_onnx_checker' is deprecated and ignored. It will be removed in "
W0305 12:39:40.472136 140018114328384 tensor_quantizer.py:280] Use Pytorch's native experimental fake quantization.
/usr/local/lib/python3.8/dist-packages/pytorch_quantization/nn/modules/tensor_quantizer.py:285: TracerWarning: Converting a tensor to a Python number might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
Also, the accuracy reported for the ONNX model is not valid:
Accuracy summary:
+-----------+-------+
| Stage | Top1 |
+-----------+-------+
| Finetuned | 38.03 |
| ONNX | -1.00 |
+-----------+-------+
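A top-1 accuracy computed from predictions always lies between 0 and 100, so the -1.00 above is presumably a placeholder rather than a measured value (an assumption; the evaluation code is not shown here). One way to check this independently would be to run the exported file through ONNX Runtime and count correct predictions directly. The following is only a minimal sketch: `evaluate_onnx_top1`, `top1_correct`, and the `batches` iterable are hypothetical names, the CPU execution provider is an arbitrary choice, and the model is assumed to have a single image input and a logits output in first position. Since `dynamic_axes` is defined but not passed to `torch.onnx.export` in the code above, each batch would likely need to contain exactly `batch_onnx` samples.

```python
import numpy as np


def top1_correct(logits, labels):
    """Count samples whose highest-scoring class equals the label."""
    predictions = np.argmax(logits, axis=1)
    return int(np.sum(predictions == np.asarray(labels)))


def evaluate_onnx_top1(onnx_filename, batches):
    """Return top-1 accuracy in percent for an ONNX classifier.

    `batches` is a hypothetical iterable of (images, labels) pairs, where
    `images` is a NumPy array of shape (N, 3, 224, 224).
    """
    # Imported lazily so that top1_correct can be used without ONNX Runtime.
    import onnxruntime

    session = onnxruntime.InferenceSession(
        onnx_filename, providers=["CPUExecutionProvider"]
    )
    input_name = session.get_inputs()[0].name

    correct = 0
    total = 0
    for images, labels in batches:
        # Assumes the first output holds the class logits.
        logits = session.run(None, {input_name: images.astype(np.float32)})[0]
        correct += top1_correct(logits, labels)
        total += len(labels)

    if total == 0:
        raise ValueError("no samples were evaluated")
    return 100.0 * correct / total
```

If a function like this raises an error or returns a value very different from the finetuned accuracy, that would help separate a failed evaluation from an actual accuracy drop caused by the export.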
Environment details:
PyTorch 1.10.2+cu102
torchvision 0.11.3+cu102
TensorRT 8.2.3-1+cuda11.4
ONNX 1.11.0
ONNX Runtime 1.10.0
CUDA 11.6
Python 3.8
What is the problem with ONNX conversion?
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
