What and where am I going wrong in this code for PyTorch-based object detection?

I am using YOLOv5 for this project.

Here is my code

import numpy as np
import cv2
import torch
import torch.backends.cudnn as cudnn
from models.experimental import attempt_load
from utils.general import non_max_suppression
import time

weights = '/Users/nidhi/Desktop/yolov5/best.pt'
device = torch.device('cpu')

model = attempt_load(weights, map_location=device)  # load FP32 model
stride = int(model.stride.max())  # model stride
cudnn.benchmark = True

# Capture with opencv and detect object
cap = cv2.VideoCapture('Pothole testing.mp4')
width, height = (352, 352) # quality 
cap.set(3, width) # width
cap.set(4, height) # height

while(cap.isOpened()):
    time.sleep(0.2) # wait for 0.2 second 
    ret, frame = cap.read()
    if ret:
        now = time.time()
        img = torch.from_numpy(frame).float().to(device).permute(2, 0, 1)
        img /= 255.0  # 0 - 255 to 0.0 - 1.0
        
        if img.ndimension() == 3:
            img = img.unsqueeze(0)

        pred = model(img, augment=False)[0]
        pred = non_max_suppression(pred, 0.39, 0.45, classes=0, agnostic=True) # img, conf, iou, classes, ...
        print('time -> ', time.time()-now)
    else:
        break

cap.release()

The error I am getting:

  File "run.py", line 38, in <module>
    pred = model(img, augment=False)[0]
  File "/Users/nidhi/Library/Python/3.8/lib/python/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/Users/nidhi/Desktop/yolov5/models/yolo.py", line 118, in forward
    return self.forward_once(x, profile)  # single-scale inference, train
  File "/Users/nidhi/Desktop/yolov5/models/yolo.py", line 134, in forward_once
    x = m(x)  # run
  File "/Users/nidhi/Library/Python/3.8/lib/python/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/Users/nidhi/Desktop/yolov5/models/common.py", line 152, in forward
    return torch.cat(x, self.d)
RuntimeError: Sizes of tensors must match except in dimension 1. Got 108 and 107 in dimension 3 (The offending index is 1)

Operating system: macOS Big Sur 11.2.3

Python version: 3.8.2

The model used is best.pt, which I had trained on Google Colab. I used the yolov5l model to train the dataset.



Solution 1:[1]

Are you getting your error in the following line?

pred = model(img, augment=False)[0]

It might be because YOLO expects input image sizes that are multiples of 32, such as 320×320 or 352×352. But your frames are 352×288. You will either have to resize them, or pad the 288 dimension with white/black pixels to make it 352.

If you are not sure about where you are getting the error, can you attach the whole error?
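A minimal sketch of the padding idea, assuming black padding in the bottom/right margins (the function name `pad_to_multiple_of_32` is made up for illustration; YOLOv5's own `letterbox` utility does a more careful centered resize-and-pad):

```python
import numpy as np

def pad_to_multiple_of_32(frame):
    """Pad an HxWxC frame with black pixels so H and W are multiples of 32."""
    h, w = frame.shape[:2]
    new_h = (h + 31) // 32 * 32  # round height up to the nearest multiple of 32
    new_w = (w + 31) // 32 * 32  # round width up to the nearest multiple of 32
    padded = np.zeros((new_h, new_w, frame.shape[2]), dtype=frame.dtype)
    padded[:h, :w] = frame  # original image sits in the top-left corner
    return padded

# Example: an odd-sized frame gets rounded up on both dimensions
frame = np.zeros((300, 428, 3), dtype=np.uint8)
print(pad_to_multiple_of_32(frame).shape)  # (320, 448, 3)
```

Applying this to `frame` before `torch.from_numpy(frame)` would make both spatial dimensions divisible by the model stride, avoiding the size mismatch in `torch.cat`.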

Solution 2:[2]

See the solution here: https://www.youtube.com/watch?v=_gQ2Xzld0m4 It worked for me; follow exactly the same steps, starting from model=yolov5s.pt.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution 1: stupid_cannon
Solution 2: Citoyen x14