OCR from Slightly Different PDFs

I am working on a project where I have to extract information from PDF documents. Most of the documents follow a similar format, but a few differ slightly in layout. How do I handle this using Python?

I am working with Form 483 documents available on the FDA website.

Site Link

I want to extract the employee information given at the bottom of the page. The format of the documents varies slightly. How can I extract this information reliably?

Example Documents:

https://www.fda.gov/media/101442/download

https://www.fda.gov/media/135387/download

https://www.fda.gov/media/89200/download
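One possible direction, offered as a rough sketch rather than a tested solution for these files: instead of relying on fixed page coordinates, anchor on the field labels in the OCR text and read the lines that follow them. The example below assumes the OCR step has already produced plain text, that the employee block is introduced by a label resembling "EMPLOYEE(S) SIGNATURE" or "EMPLOYEE(S) NAME AND TITLE", and that each employee appears as a "Name, Title" line. The label patterns, the function name `extract_employees`, and the sample texts are all hypothetical and would need adjusting against real OCR output.

```python
import re

# Assumed label patterns (hypothetical); tune them after inspecting real OCR text.
# The start pattern tolerates "EMPLOYEE(S)", "EMPLOYEES", "Employee (s)", etc.
START_LABEL = re.compile(
    r"EMPLOYEE\s*\(?\s*S?\s*\)?\s*(SIGNATURE|NAME)", re.IGNORECASE
)
STOP_LABEL = re.compile(r"DATE\s+ISSUED|FORM\s+FDA\s+483", re.IGNORECASE)

# Assumed line shape: "Name, Title" with optional spaces around the comma.
NAME_TITLE = re.compile(
    r"^\s*(?P<name>[A-Z][A-Za-z.'\- ]+?)\s*,\s*(?P<title>[A-Za-z][A-Za-z /&\-]+?)\s*$"
)


def extract_employees(ocr_text):
    """Return (name, title) pairs found after the employee label line."""
    employees = []
    in_block = False
    for line in ocr_text.splitlines():
        if START_LABEL.search(line):
            # Label line found; any other labels on the same line are ignored.
            in_block = True
            continue
        if in_block and STOP_LABEL.search(line):
            break
        if in_block:
            match = NAME_TITLE.match(line)
            if match:
                employees.append((match.group("name"), match.group("title")))
    return employees


# Made-up sample texts imitating two slightly different layouts.
LAYOUT_A = """OBSERVATION 1
Procedures are not followed.
EMPLOYEE(S) SIGNATURE    EMPLOYEE(S) NAME AND TITLE (Print or Type)    DATE ISSUED
Jane Doe, Investigator
John Roe, Chemist
FORM FDA 483 (09/08)
"""

LAYOUT_B = """Employee(s) Name and Title
Jane Doe , Investigator
Date Issued
01/02/2020
"""

if __name__ == "__main__":
    print(extract_employees(LAYOUT_A))
    print(extract_employees(LAYOUT_B))
```

Because the matching depends on labels and line shape rather than position, small layout shifts between documents matter less. Handwritten signatures or heavily garbled OCR output would still need separate handling.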

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
