'Quantized model gives negative accuracy after conversion from pytorch to ONNX

I'm trying to train a quantize model in pytorch and convert it to ONNX. I employ the quantized-aware-training technique with help of pytorch_quantization package. I used the below code to convert my model to ONNX:

from pytorch_quantization import nn as quant_nn
from pytorch_quantization import calib
from pytorch_quantization.tensor_quant import QuantDescriptor
from pytorch_quantization import quant_modules
import onnxruntime 
import torch
import torch.utils.data
from torch import nn
import torchvision

def export_onnx(model, onnx_filename, batch_onnx, per_channel_quantization):
    model.eval()
    quant_nn.TensorQuantizer.use_fb_fake_quant = True # We have to shift to pytorch's fake quant ops before exporting the model to ONNX

    if per_channel_quantization:
        opset_version = 13
    else:
        opset_version = 12

    # Export ONNX for multiple batch sizes
    print("Creating ONNX file: " + onnx_filename)
    dummy_input = torch.randn(batch_onnx, 3, 224, 224, device='cuda') #TODO: switch input dims by model
    input_names = ['input']
    output_names = ['Linear[fc]']  ### ResNet34
    dynamic_axes = {'input': {0: 'batch_size'}}

    try:
        torch.onnx.export(model, dummy_input, onnx_filename, input_names=input_names,
                          export_params=True, output_names=output_names, opset_version=opset_version,
                          verbose=True, enable_onnx_checker=False, do_constant_folding=True)

    except ValueError:
        warnings.warn(UserWarning("Per-channel quantization is not yet supported in Pytorch/ONNX RT (requires ONNX opset 13)"))
        print("Failed to export to ONNX")
        return False
    return True

After conversion, I get the following warnings:

warnings.warn("'enable_onnx_checker' is deprecated and ignored. It will be removed in " W0305 12:39:40.472136 140018114328384 tensor_quantizer.py:280] Use Pytorch's native experimental fake quantization.

/usr/local/lib/python3.8/dist-packages/pytorch_quantization/nn/modules/tensor_quantizer.py:285: TracerWarning: Converting a tensor to a Python number might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!

Also, the accuracy is not valid for ONNX model!

Accuracy summary:
+-----------+-------+
| Stage     |  Top1 |
+-----------+-------+
| Finetuned | 38.03 |
| ONNX      | -1.00 |
+-----------+-------+

More info is here:

pytorch 1.10.2+cu102
torchvision 0.11.3+cu102 
TensorRT  8.2.3-1+cuda11.4
ONNX 1.11.0
ONNX Runtime 1.10.0
cuda 11.6
python 3.8

What is the problem with ONNX conversion?



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source