OCRmyPDF: 'pages' parameter not working as expected even with optimization disabled
I'm using ocrmypdf and I only want the first page of each file to be run through OCR. I'm trying to do this with:
ocrmypdf -l por --force-ocr --pages 1 --optimize 0 input.pdf output.pdf
but even then it outputs
Start processing 10 pages concurrently
The files are in Portuguese, and some of them contain text in fonts that I can't read from Python because the extracted string turns into a run of "(cid:)" tokens, which is why I use --force-ocr.
I also have a lot of files (the files are actually a parameter for an application I'm developing), so this is taking too much time.
My operating system is Windows, if that helps.
Solution 1:[1]
When Maven packages your source code, it changes the folder structure.
In a jar packaging:
- src/main/java: compiled sources go to the jar's root (keeping Java packages as a folder structure).
- src/main/resources: files go to the jar's root too.
So your file, once the jar is packaged, is in the root of the archive. Actually jar files are just zip files with a different extension, so you can use any zip manager to open it and explore it.
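Since a jar is a zip archive, its layout can also be checked from code. The following is only a sketch: the class name JarLister and the jar path passed on the command line are hypothetical, and it uses nothing beyond the standard java.util.zip API to print every entry name, so you can confirm that credentials.json sits at the root.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of Maven or of the original project.
class JarLister {
    // Returns the names of all entries in a jar (or any zip archive).
    static List<String> listEntries(Path archive) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(archive.toFile())) {
            zip.stream().map(ZipEntry::getName).forEach(names::add);
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is assumed to be the path to the packaged jar.
        listEntries(Path.of(args[0])).forEach(System.out::println);
    }
}
```

A file placed in src/main/resources should show up in this listing without any src/main/resources prefix.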
To access the file, do exactly what you are already doing: load it as a resource through the jar's class loader. Any class from your jar will do, since it delegates the lookup to its class loader. Just change the path:
InputStream is = Main.class.getResourceAsStream("/credentials.json");
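Note that getResourceAsStream returns null when the resource is not found, so it may be worth wrapping the call. Here is a minimal sketch, assuming a hypothetical class named ResourceLoader and that the file was placed in src/main/resources so it ends up at the jar's root:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; any class packaged in the same jar would work.
class ResourceLoader {
    // Reads a classpath resource into a String.
    // A leading "/" makes the path resolve from the classpath (jar) root.
    static String readResource(String path) throws IOException {
        try (InputStream is = ResourceLoader.class.getResourceAsStream(path)) {
            if (is == null) {
                // getResourceAsStream signals a missing resource with null.
                throw new IOException("Resource not found on classpath: " + path);
            }
            return new String(is.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readResource("/credentials.json"));
    }
}
```

Failing with an explicit message makes a wrong path easier to spot than a NullPointerException later on.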
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
