TFLite - Post Training (full integer) Quantization - problems with using a representative dataset

I am having a hard time getting good results from a full-integer quantized TFLite model using post-training quantization. The model does not recognize anything correctly. I started from the notebook tutorial provided by Google and modified it. Here is my version, in which I try to perform full integer quantization using images from the COCO validation dataset. It should run completely on its own.

Something is probably wrong with _representative_dataset_gen(), which looks like this:

import os
from fnmatch import fnmatch

import cv2
import numpy as np

def _representative_dataset_gen():
    print("200 rep dataset function called!")
    root = 'val2017/'
    pattern = "*.jpg"
    imagePaths = []
    for path, subdirs, files in os.walk(root):
        for name in files:
            if fnmatch(name, pattern):
                # Join with the walked directory so files in subdirectories resolve too
                imagePaths.append(os.path.join(path, name))
    for index, p in enumerate(imagePaths[:200]):
        if index % 10 == 0:
            print(index)
        image = cv2.imread(p)
        if image is None:
            # cv2.imread returns None for unreadable files; skip them
            continue
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        image = cv2.resize(image, (640, 640))
        # Add the batch dimension: (640, 640, 3) -> (1, 640, 640, 3)
        image = np.expand_dims(image, axis=0)
        # The converter expects a list of input arrays per sample
        yield [image.astype("float32")]
        #yield image
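
One check that might help narrow this down is to inspect what the generator actually yields: the shape, the dtype, and the value range of each sample. The sketch below is only an illustration. `fake_representative_dataset` is a hypothetical stand-in that produces random pixel data so the code runs without the COCO images, and `summarize_samples` is a hypothetical helper, not part of any library.

```python
import numpy as np

def fake_representative_dataset(num_samples=5, size=640, seed=0):
    """Stand-in generator (hypothetical): random RGB data in [0, 255] as float32."""
    rng = np.random.default_rng(seed)
    for _ in range(num_samples):
        image = rng.integers(0, 256, size=(1, size, size, 3)).astype("float32")
        yield [image]

def summarize_samples(gen):
    """Collect shapes, dtypes, and the overall value range of all yielded samples."""
    shapes, dtypes = set(), set()
    lo, hi = float("inf"), float("-inf")
    count = 0
    for sample in gen:
        # Each sample should be a list with one array per model input.
        assert isinstance(sample, list) and len(sample) == 1
        arr = sample[0]
        shapes.add(arr.shape)
        dtypes.add(str(arr.dtype))
        lo = min(lo, float(arr.min()))
        hi = max(hi, float(arr.max()))
        count += 1
    return {"count": count, "shapes": shapes, "dtypes": dtypes, "min": lo, "max": hi}

summary = summarize_samples(fake_representative_dataset())
print(summary)
```

Calling `summarize_samples(_representative_dataset_gen())` would report the same statistics for the real images. If the range comes out as roughly 0 to 255 while the model was trained on inputs scaled to, say, [0, 1] or [-1, 1], the calibration data would not match the training data. Whether that applies here depends on the model's preprocessing, which I have not confirmed.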

I also compared it with a full-integer version that gets only a single image as its representative dataset. Interestingly, it performs very similarly, so I am fairly confident that my approach is wrong.
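
As far as I understand, calibration uses the representative data to estimate value ranges, from which a scale and a zero point are derived. The sketch below is a simplified min/max affine scheme in plain Python, not TFLite's actual calibration code. The function names and the sample values are made up for illustration.

```python
def affine_params(values, qmin=-128, qmax=127):
    """Derive scale and zero point from the observed min/max (simplified)."""
    lo = min(min(values), 0.0)  # the representable range must include zero
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin)
    zero_point = int(round(qmin - lo / scale))
    return scale, zero_point

def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Map a float to an int8 value, clamping to the valid range."""
    q = int(round(x / scale)) + zero_point
    return max(qmin, min(qmax, q))

def dequantize(q, scale, zero_point):
    """Map an int8 value back to the float it represents."""
    return (q - zero_point) * scale

# Calibration on raw pixel values (0..255) ...
scale_raw, zp_raw = affine_params([0.0, 37.0, 255.0])
# ... versus calibration on values scaled to 0..1.
scale_unit, zp_unit = affine_params([0.0, 0.145, 1.0])

# Round-trip error for an input of 0.5 under each calibration.
x = 0.5
err_unit = abs(dequantize(quantize(x, scale_unit, zp_unit), scale_unit, zp_unit) - x)
err_raw = abs(dequantize(quantize(x, scale_raw, zp_raw), scale_raw, zp_raw) - x)
print(err_unit, err_raw)
```

In this toy setup the step size under the 0..255 calibration is 1.0, so a value of 0.5 is rounded to a whole number and most of its information is lost, while the matching calibration keeps the error small. A single image and 200 images of raw pixels may both span nearly the full 0..255 range, so similar results from the two runs might not rule out a range mismatch between calibration data and what the model expects.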

Don't hesitate to ask questions. I would appreciate any help.



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
