How to find coordinates of any PDF object

I am generating PDF files from HTML ones using wkhtmltopdf. I would like to get the coordinates and size of an object in the PDF. For example, suppose I add an image to the HTML file with a tag like this: <img id="myimg" src="image.png" width="100" height="100"/>. I wonder whether it is possible to get the image's position and size in the generated PDF. Please note that the object does not have to be an image. Any object that is mapped from the HTML file to the final PDF would do the trick.

I am using iText 7 (v7.2.1, .NET version) to analyse the PDF file. So far, I have been able to read the PDF objects in the document, but I have not been able to get their identifiers. I need to identify each object so that I can pick out the one I am looking for and then get its coordinates.

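// Requires: using System.Threading.Tasks; using iText.Kernel.Pdf;
private static async Task<int> AnalyzeFileAsync(string filePath)
{
    PdfDocument doc = new PdfDocument(new PdfReader(filePath));
    //IEventListener listener = new 
    //PdfCanvasProcessor proc = new PdfCanvasProcessor();
    int objNum = doc.GetNumberOfPdfObjects();
    for (int i = 1; i < objNum; i++)
    {
        // Low-level indirect object; it carries no HTML id or page position.
        PdfObject obj = doc.GetPdfObject(i);
    }
    doc.Close();
    // Placeholder return value so the method compiles.
    return objNum;
}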
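
The commented-out lines above hint at iText's content-stream parser, so here is a minimal sketch of that direction, not a confirmed solution. It assumes wkhtmltopdf writes the picture as a regular image that the parser reports through a RENDER_IMAGE event, and it identifies images only by their PDF resource name, since the HTML id is not available. ImageBoundsListener, BoundsFromCtm and Program are hypothetical names introduced for illustration; the iText calls are the ones from the iText 7 kernel parser namespaces.

```csharp
using System;
using System.Collections.Generic;
using iText.Kernel.Geom;
using iText.Kernel.Pdf;
using iText.Kernel.Pdf.Canvas.Parser;
using iText.Kernel.Pdf.Canvas.Parser.Data;
using iText.Kernel.Pdf.Canvas.Parser.Listener;

// Hypothetical helper (not part of iText): records where each image is drawn on a page.
public class ImageBoundsListener : IEventListener
{
    public List<(string Name, float X, float Y, float Width, float Height)> Images { get; }
        = new List<(string Name, float X, float Y, float Width, float Height)>();

    public void EventOccurred(IEventData data, EventType type)
    {
        if (type != EventType.RENDER_IMAGE) return;
        var info = (ImageRenderInfo)data;

        // The image CTM maps the unit square onto the area the image covers on the page.
        Matrix m = info.GetImageCtm();
        var box = BoundsFromCtm(
            m.Get(Matrix.I11), m.Get(Matrix.I12),
            m.Get(Matrix.I21), m.Get(Matrix.I22),
            m.Get(Matrix.I31), m.Get(Matrix.I32));

        // Inline images have no resource name.
        string name = info.GetImageResourceName()?.GetValue() ?? "(inline)";
        Images.Add((name, box.X, box.Y, box.Width, box.Height));
    }

    public ICollection<EventType> GetSupportedEvents()
    {
        return new HashSet<EventType> { EventType.RENDER_IMAGE };
    }

    // Axis-aligned bounding box of the unit square transformed by [a b; c d; e f].
    // Taking min/max over the four corners also covers rotated or flipped images.
    public static (float X, float Y, float Width, float Height) BoundsFromCtm(
        float a, float b, float c, float d, float e, float f)
    {
        float[] xs = { e, a + e, c + e, a + c + e };
        float[] ys = { f, b + f, d + f, b + d + f };
        float minX = Math.Min(Math.Min(xs[0], xs[1]), Math.Min(xs[2], xs[3]));
        float maxX = Math.Max(Math.Max(xs[0], xs[1]), Math.Max(xs[2], xs[3]));
        float minY = Math.Min(Math.Min(ys[0], ys[1]), Math.Min(ys[2], ys[3]));
        float maxY = Math.Max(Math.Max(ys[0], ys[1]), Math.Max(ys[2], ys[3]));
        return (minX, minY, maxX - minX, maxY - minY);
    }
}

public static class Program
{
    public static void Main(string[] args)
    {
        // args[0] is assumed to be the path to the PDF, e.g. the test file from this question.
        PdfDocument doc = new PdfDocument(new PdfReader(args[0]));
        for (int p = 1; p <= doc.GetNumberOfPages(); p++)
        {
            var listener = new ImageBoundsListener();
            new PdfCanvasProcessor(listener).ProcessPageContent(doc.GetPage(p));
            foreach (var img in listener.Images)
            {
                Console.WriteLine(
                    $"page {p}: {img.Name} x={img.X} y={img.Y} w={img.Width} h={img.Height}");
            }
        }
        doc.Close();
    }
}
```

The values are in PDF user-space units (points by default) with the origin at the bottom-left corner of the page, so they will not match the pixel sizes given in the HTML directly.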

I created this test PDF file with just one image inside it: blank.pdf. The original HTML text is this:

<html>
<head>
</head>
<body>
<img id="aaaaaaaaaaaaa" src="p.png" width="100" height="100"/>
</body>
</html>

I expected the aaaaaaaaaaaaa identifier to be carried over into the PDF, but that does not seem to be the case. Hence, I do not know how to mark a specific object in the HTML and then find it in the PDF.

Could you please give me a hand with this?

Regards,



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
