'How to convert PASCAL VOC to YOLO

I was trying to develop some way to convert annotations between formats, and it's quit hard to find information but here I have :

This one is PASCAL VOC

<width>800</width>
<height>450</height>
<depth>3</depth>
<bndbox>
    <xmin>474</xmin>
    <ymin>2</ymin>
    <xmax>726</xmax> <!-- shape_width = 252  -->
    <ymax>449</ymax> <!-- shape_height = 447 -->
</bndbox>

convert to YOLO darknet

2 0.750000 0.501111 0.315000 0.993333

note initial 2 it's a category



Solution 1:[1]

using some math: (also can be useful to COCO)

categ_index [(xmin + xmax) / 2 / image_width] [(ymin + ymax) / 2 / image_height] [(xmax - xmin) / image_width] [(ymax  - ymin) / image_height]

in js code

const categ_index = 2;

const { width: image_width, height: image_height } = {
  width: 800,
  height: 450,
};

const { xmin, ymin, xmax, ymax } = {
  xmin: 474,
  ymin: 2,
  xmax: 727,
  ymax: 449,
};

const x_coord = (xmin + xmax) / 2 / image_width;

const y_coord = (ymin + ymax) / 2 / image_height;

const shape_width = (xmax - xmin) / image_width;

const shape_height = (ymax - ymin) / image_height;

console.log(`${categ_index} ${x_coord.toFixed(7)} ${y_coord.toFixed(7)} ${shape_width.toFixed(7)} ${shape_height.toFixed(7)}`);

// output
// 2 0.7506250 0.5011111 0.3162500 0.9933333

Solution 2:[2]

My classmates and I have created a python package called PyLabel to help others with this task and other labelling tasks.

This would be the basic code to convert from voc to coco:

!pip install pylabel
from pylabel import importer
dataset = importer.ImportVOC(path=path_to_annotations)
dataset.exporter.ExportToYoloV5() 

You can find sample notebooks and the source code here https://github.com/pylabel-project/pylabel

Solution 3:[3]

I use the following snippet to convert Pascal_VOC to YOLO. Yolo uses normalized coordinates so it is important to have the height and width of your image. Otherwise you can't calculate it.

Here is my snippet:

# Convert Pascal_Voc bb to Yolo
def pascal_voc_to_yolo(x1, y1, x2, y2, image_w, image_h):
    return [((x2 + x1)/(2*image_w)), ((y2 + y1)//(2*image_h)), (x2 - x1)/image_w, (y2 - y1)/image_h]

I wrote an article about object detection format and how to convert them. You can check my blog post on Medium: https://christianbernecker.medium.com/convert-bounding-boxes-from-coco-to-pascal-voc-to-yolo-and-back-660dc6178742

Have fun!

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 alexheat
Solution 3 Christian Bernecker