iText find position of text in pdf

I am creating a utility that will add a multi-line text field just below the last line of text in an existing PDF document. This will be used for people who want to add comments to a report that is generated from another system.

I followed the examples from the iText book, and also looked at this SO question: Get the exact Stringposition in PDF

So now I've got a method that parses the last page of the document and a custom listener that finds the coordinates of the text I'm looking for.

Here is my parsing method:

private void parsePdfPage2(String src, int pageNum) throws IOException {
    PdfReader reader = new PdfReader(src);

    RenderListener listener = new MyTextRenderListener();
    PdfContentStreamProcessor processor = new PdfContentStreamProcessor(listener);
    PdfDictionary pageDic = reader.getPageN(pageNum);
    PdfDictionary resourcesDic = pageDic.getAsDict(PdfName.RESOURCES);
    processor.processContent(ContentByteUtils.getContentBytesForPage(reader, pageNum), resourcesDic);
}

And here is the listener:

public class MyTextRenderListener implements RenderListener {
    @Override
    public void beginTextBlock() {}

    @Override
    public void endTextBlock() {}

    @Override
    public void renderImage(ImageRenderInfo renderInfo) {}

    @Override
    public void renderText(TextRenderInfo renderInfo) {
        // Check if this is the text marker
        String text = renderInfo.getText();
        if (text.equalsIgnoreCase("Comments")) {
            // Found it
            LineSegment ls = renderInfo.getBaseline();
            System.out.println("Found at X: " + ls.getBoundingRectange().getX() +
                    ",  Y: " + ls.getBoundingRectange().getY());
        }
    }
}

However, now I need to send the found LineSegment object (or the individual coordinates) back to the original parsing method. Of course I could write the values to disk and read them back in the parsing method, but that seems horrible. I'm pretty sure there is a more elegant way to achieve this and would appreciate pointers.



Solution 1:[1]

This is an old question, but still: you could store the listener's results in a private list field of the listener class. All that's left is to add a public method (getResults()) that returns the list.

Simply call getResults() after the processContent call. Because processContent is synchronous, the list is guaranteed to be fully populated by the time it returns.

The only problem with this solution is that a listener instance can't be reused for another page or document as-is, because its list still holds the earlier results.
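A minimal sketch of this pattern follows. So that it compiles without iText on the classpath, it uses hypothetical stand-ins: the TextListener interface plays the role of iText's RenderListener, the Match class holds the coordinates that would come from the LineSegment, and process() imitates the synchronous processContent call. The reset() method is not part of the suggestion above; it is just one possible way to make a listener reusable.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for iText's RenderListener (text callback only).
interface TextListener {
    void renderText(String text, float x, float y);
}

// Hypothetical value object for one hit; real code could store the LineSegment instead.
final class Match {
    final String text;
    final float x;
    final float y;

    Match(String text, float x, float y) {
        this.text = text;
        this.x = x;
        this.y = y;
    }
}

// Listener that remembers every matching chunk instead of printing it.
class CollectingListener implements TextListener {
    private final String marker;
    private final List<Match> results = new ArrayList<>();

    CollectingListener(String marker) {
        this.marker = marker;
    }

    @Override
    public void renderText(String text, float x, float y) {
        if (text.equalsIgnoreCase(marker)) {
            results.add(new Match(text, x, y));
        }
    }

    // Read-only view of the hits collected so far.
    List<Match> getResults() {
        return Collections.unmodifiableList(results);
    }

    // Assumption: clearing the list is enough to reuse the listener.
    void reset() {
        results.clear();
    }
}

class ListenerDemo {
    // Stand-in for processContent: invokes the listener synchronously,
    // so all callbacks have run by the time this method returns.
    static void process(String[] chunks, TextListener listener) {
        float y = 700f;
        for (String chunk : chunks) {
            listener.renderText(chunk, 36f, y);
            y -= 14f;
        }
    }

    public static void main(String[] args) {
        CollectingListener listener = new CollectingListener("Comments");
        process(new String[] {"Report", "Total", "comments"}, listener);

        // Safe to read here because process() has already returned.
        for (Match m : listener.getResults()) {
            System.out.println("Found at X: " + m.x + ", Y: " + m.y);
        }
    }
}
```

Applied to the code in the question, the local variable in parsePdfPage2 would have to be declared as MyTextRenderListener rather than RenderListener so that getResults() is visible, and the method could then return the list instead of void.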

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Gonnen Daube