How to correctly use torchvision.ops.generalized_box_iou_loss?

I’m training a model to fit bounding boxes on images. Each bounding box is defined by two corner points, given as (x1, y1, x2, y2). To fit these bounding boxes I first used mse_loss. The loss converges, but the results are still not good enough. I therefore tried generalized_box_iou_loss with reduction='mean' (to get a scalar for backpropagation). My bounding boxes satisfy the requirements 0 <= x1 < x2 and 0 <= y1 < y2. However, the loss only approaches 1.
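
For reference, the value 1 can be read against the usual GIoU definition, where the loss is 1 - GIoU and ranges from 0 (identical boxes) to nearly 2 (small boxes far apart). The following is a minimal plain-Python sketch of that definition for a single pair of boxes. It is not torchvision's code; the function name `giou_loss` and the example boxes are made up for illustration.

```python
def giou_loss(box_a, box_b, eps=1e-7):
    """GIoU loss (1 - GIoU) for two boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration; assumes x1 < x2 and y1 < y2.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero when the boxes do not overlap)
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih

    # Union area and plain IoU
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / (union + eps)

    # Area of the smallest box enclosing both inputs
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))

    # GIoU subtracts the share of the enclosing box not covered by the union
    giou = iou - (enclose - union) / (enclose + eps)
    return 1.0 - giou


print(giou_loss((0, 0, 2, 2), (0, 0, 2, 2)))    # close to 0: identical boxes
print(giou_loss((0, 0, 1, 1), (1, 0, 2, 1)))    # 1.0: boxes touch, no overlap
print(giou_loss((0, 0, 1, 1), (9, 9, 10, 10)))  # about 1.98: far apart
```

Under this definition, a loss near 1 corresponds to a GIoU near 0, for example boxes with little or no overlap whose enclosing box is not much larger than their union.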

The final layer of the model looks like this.

self.fc3 = nn.Linear(256, 4)
...
# in forward:
x = self.fc3(x)

And the training step is defined as follows:

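from torchvision.ops import generalized_box_iou_loss

def training_step(self, batch, batch_idx):
    x, y = batch  # images and target boxes as (x1, y1, x2, y2)
    y_hat = self.model(x)  # predicted boxes, shape (N, 4)
    # Mean GIoU loss over the batch: a scalar for backpropagation
    loss = generalized_box_iou_loss(y_hat, y, reduction='mean')
    self.log("train_loss", loss)
    return loss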
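
The requirement x1 < x2 and y1 < y2 applies to both arguments of the loss, and the raw output of an nn.Linear layer is not guaranteed to satisfy it. As a diagnostic, a sketch like the following could report what fraction of predicted boxes are valid. The helper name `valid_box_fraction` and the sample tensor are hypothetical; in the training step it would be called on `y_hat`.

```python
import torch


def valid_box_fraction(boxes: torch.Tensor) -> float:
    """Fraction of rows in an (N, 4) tensor with x1 < x2 and y1 < y2.

    Hypothetical diagnostic helper, not part of torchvision.
    """
    x1, y1, x2, y2 = boxes.unbind(dim=-1)
    valid = (x1 < x2) & (y1 < y2)
    return valid.float().mean().item()


# Made-up predictions: the second row has x1 > x2, so it is not a valid box
y_hat = torch.tensor([[0.1, 0.2, 0.5, 0.6],
                      [0.7, 0.1, 0.3, 0.4]])
print(valid_box_fraction(y_hat))  # 0.5
```

If this fraction stays well below 1 during training, the predictions themselves may be violating the box format that the loss assumes.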

Any ideas how to resolve this issue? Best, Jona



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
