OCR from Slightly Different PDFs
I am working on a project where I have to extract information from PDF documents. Although the documents follow a similar format, a few of them differ slightly in their layout. How do I handle this using Python?
I am working with Form 483 documents available on the FDA website.
I want to extract the employee information given at the bottom of the page. Since the format of the document varies slightly, how can I extract this information?
Example Documents:
https://www.fda.gov/media/101442/download
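For context on what "handling slight differences" could look like, here is a minimal sketch that anchors on a field label instead of a fixed page position. It assumes the page text has already been obtained from an OCR or PDF text-extraction step (not shown), and that the employee details follow a label resembling "EMPLOYEE(S) NAME AND TITLE". The label wording, the stop labels, the function name `extract_employee_block`, and the sample strings are all assumptions for illustration and are not taken from the actual forms.

```python
import re

# Assumed label wording; the pattern tolerates case changes, "(S)" or "s",
# "and" versus "&", and trailing text such as "(Print or Type)".
LABEL = re.compile(
    r"employee(?:\(s\)|s)?\s+name\s*(?:and|&)\s*title[^\n]*",
    re.IGNORECASE,
)

# Assumed labels that mark the start of the next field.
STOP = re.compile(r"date\s+issued|signature", re.IGNORECASE)


def extract_employee_block(page_text):
    """Return the non-empty lines between the employee label and the next label."""
    match = LABEL.search(page_text)
    if match is None:
        return []  # label not found; the layout differs more than expected
    collected = []
    for line in page_text[match.end():].splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines produced by OCR spacing
        if STOP.search(line):
            break  # reached the next field
        collected.append(line)
    return collected


# Made-up sample text standing in for two slightly different layouts.
sample_a = (
    "OBSERVATION 1\n"
    "EMPLOYEE(S) NAME AND TITLE (Print or Type)\n"
    "Jane Doe, Investigator\n"
    "DATE ISSUED\n"
    "01/01/2020\n"
)
sample_b = (
    "Employee Name & Title:\n"
    "\n"
    "  John Roe, Consumer Safety Officer\n"
    "Date Issued: 02/02/2021\n"
)

print(extract_employee_block(sample_a))
print(extract_employee_block(sample_b))
```

Searching for a label with a tolerant pattern means small shifts in position, capitalization, or punctuation do not break the extraction, whereas fixed coordinates or exact string matches would. If the label cannot be found at all, the function returns an empty list so that such documents can be set aside for manual review.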
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
