Poppler in path for pdf2image
I'm trying to use pdf2image and it seems I need something called propeller:
(sum_env) C:\Users\antoi\Documents\Programming\projects\summarizer>python ocr.py -i fr13_idf.pdf
Traceback (most recent call last):
File "c:\Users\antoi\Documents\Programming\projects\summarizer\sum_env\lib\site-packages\pdf2image\pdf2image.py", line 165, in __page_count
proc = Popen(["pdfinfo", pdf_path], stdout=PIPE, stderr=PIPE)
File "C:\Python37\lib\subprocess.py", line 769, in __init__
restore_signals, start_new_session)
File "C:\Python37\lib\subprocess.py", line 1172, in _execute_child
startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "ocr.py", line 53, in <module>
pdfspliterimager(image_path)
File "ocr.py", line 32, in pdfspliterimager
pages = convert_from_path("document-page%s.pdf" % i, 500)
File "c:\Users\antoi\Documents\Programming\projects\summarizer\sum_env\lib\site-packages\pdf2image\pdf2image.py", line 30, in convert_from_path
page_count = __page_count(pdf_path, userpw)
File "c:\Users\antoi\Documents\Programming\projects\summarizer\sum_env\lib\site-packages\pdf2image\pdf2image.py", line 169, in __page_count
raise Exception('Unable to get page count. Is poppler installed and in PATH?')
Exception: Unable to get page count. Is poppler installed and in PATH?
I tried this link, but the download it points to didn't solve my problem.
Solution 1:[1]
pdf2image is only a wrapper around poppler (not propeller!). To use the module, you need to have poppler-utils installed on your machine and available in your PATH.
The procedure is linked in the project's README in the "How to install" section.
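Before reinstalling anything, it can help to check whether Python can actually see the Poppler tools. The sketch below asks the standard library where pdfinfo lives (the traceback in the question shows pdf2image calling pdfinfo). The helper name find_poppler_tool is my own and not part of pdf2image.

```python
import shutil


def find_poppler_tool(name="pdfinfo"):
    """Return the full path of a Poppler command-line tool, or None if it is not on PATH."""
    return shutil.which(name)


if __name__ == "__main__":
    location = find_poppler_tool()
    if location is None:
        print("pdfinfo was not found on PATH; pdf2image will not be able to call it")
    else:
        print("pdfinfo found at", location)
```

If this prints that pdfinfo was not found, the problem is the PATH seen by this Python process rather than pdf2image itself.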
Solution 2:[2]
First, download Poppler from here and extract it. Then, in your code, pass poppler_path=r'C:\Program Files\poppler-0.68.0\bin' (for example), as below:
from pdf2image import convert_from_path
images = convert_from_path("mypdf.pdf", 500, poppler_path=r'C:\Program Files\poppler-0.68.0\bin')
for i, image in enumerate(images):
    fname = 'image' + str(i) + '.png'
    image.save(fname, "PNG")
That's it. With this approach there is no need to edit environment variables. Let me know if you have any problems.
Solution 3:[3]
The pdf2image and pdftotext libraries both require Poppler as a backend, so you have to install it:
conda install -c conda-forge poppler
Then the error will be resolved. If it still doesn't work for you, you can follow http://blog.alivate.com.au/poppler-windows/ to install the library.
Solution 4:[4]
The problem is that poppler is not installed properly. On Debian/Ubuntu, this command installs the correct package:
sudo apt-get install poppler-utils
Solution 5:[5]
On Windows, to solve PDFInfoNotInstalledError: Unable to get page count. Is poppler installed and in PATH?:
- Install Chocolatey: https://chocolatey.org/install
- Then install poppler using choco:
choco install poppler
Solution 6:[6]
Poppler in path for pdf2image
While working with pdf2image, there are dependencies that need to be satisfied:
Installation of pdf2image
pip install pdf2image
Installation of python-dateutil
pip install python-dateutil
Installation of Poppler
Specifying Poppler path in environment variable (system path)
Installing Poppler on Windows
- Go to https://github.com/oschwartz10612/poppler-windows/releases/
- Under the latest release (v21.11.0-0 at the time of writing), open the Assets list
- Download Release-21.11.0-0.zip
Adding Poppler to path
- Extract the downloaded archive, e.g. C:\Users\UserName\Downloads\Release-21.11.0-0.zip
- Add the bin folder of the extracted Poppler directory (not the .zip file itself) to the Path system variable under Environment Variables
Specifying poppler path in code
pages = convert_from_path(filepath, poppler_path=r"actualpoppler_path")
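A wrong poppler_path produces the same "Unable to get page count" error, so it can be worth checking the folder before passing it on. This is a minimal sketch under that assumption. The helper poppler_kwargs is hypothetical (not part of pdf2image) and only checks that a pdfinfo executable is present in the folder.

```python
import os


def poppler_kwargs(bin_dir):
    """Return {'poppler_path': bin_dir} if bin_dir holds pdfinfo, else an empty dict."""
    for exe in ("pdfinfo", "pdfinfo.exe"):
        if os.path.isfile(os.path.join(bin_dir, exe)):
            return {"poppler_path": bin_dir}
    # Nothing found: return no keyword so pdf2image falls back to searching PATH
    return {}
```

It could then be used as pages = convert_from_path(filepath, **poppler_kwargs(r"actualpoppler_path")), so that an invalid folder falls back to the normal PATH lookup instead of being passed through.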
Solution 7:[7]
I'm working on a mac in Visual Studio Code and I encountered this error. I followed the install instructions and was able to verify the packages were installed but the error persisted when running in VSC.
Even though I had python.condaPath and python.pythonPath specified in my settings.json, it wasn't until I activated the conda environment inside the VSC integrated terminal itself
conda activate my_env
that the error went away.
Bizarre.
Solution 8:[8]
After downloading Poppler, set PATH from within your script:
import os
os.environ["PATH"] = r"C:.....\poppler-xxxxxxx\bin"
This makes the Poppler binaries visible to the running process. It worked for me.
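Note that assigning to os.environ["PATH"] directly replaces every other entry for the running process. A sketch of a gentler variant that keeps the existing entries and puts the Poppler folder in front is shown here. The helper name prepend_to_path is my own, and the folder you pass in is whatever bin directory you extracted Poppler to.

```python
import os


def prepend_to_path(directory, env=None):
    """Put `directory` at the front of PATH in `env` (defaults to os.environ).

    Returns the new PATH value.
    """
    if env is None:
        env = os.environ
    current = env.get("PATH", "")
    # Keep whatever was already on PATH; only add the new folder in front of it
    env["PATH"] = directory + os.pathsep + current if current else directory
    return env["PATH"]
```

Call it once before the first convert_from_path call, passing the Poppler bin folder as the argument.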
Solution 9:[9]
In Windows
Install Poppler for Windows.
500 = DPI used to render each page
path = the folder that contains the PDF files
pip install pdf2image
import os
from pdf2image import convert_from_path

path = r'C:\ABC\FEF\KLH\pdf_extractor\output\break'

def spliting_pdf2img(path):
    for file in os.listdir(path):
        if file.lower().endswith(".pdf"):
            pages = convert_from_path(os.path.join(path, file), 500, poppler_path=r'C:\ABC\DEF\Downloads\poppler-0.68.0\bin')
            base = os.path.splitext(file)[0]
            # Number the output files so later pages do not overwrite earlier ones
            for i, page in enumerate(pages, start=1):
                page.save(os.path.join(path, f"{base}_{i}.jpg"), 'JPEG')

In Linux/Ubuntu
Install the packages below from the terminal:
sudo apt-get update
sudo apt-get install poppler-utils
import os
from pdf2image import convert_from_path

# Replace with the folder that contains your PDF files
path = r'C:\ABC\FEF\KLH\pdf_extractor\output\break'

def spliting_pdf2img(path):
    for file in os.listdir(path):
        if file.lower().endswith(".pdf"):
            # No poppler_path needed: poppler-utils installs its tools on PATH
            pages = convert_from_path(os.path.join(path, file), 500)
            base = os.path.splitext(file)[0]
            for i, page in enumerate(pages, start=1):
                page.save(os.path.join(path, f"{base}_{i}.jpg"), 'JPEG')
Solution 10:[10]
I had the same problem on my Mac
I solved it by changing poppler_path from '/usr/bin' to '/usr/local/bin'.
You can also list all the places Poppler might be on your Mac by running echo $PATH in the Terminal, then try each of them as poppler_path.
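Rather than trying each entry by hand, the search can be sketched in Python. The function below walks the PATH entries and reports the folders that contain a given tool. The name dirs_containing is hypothetical, and pdftoppm is used as the example because it is one of the Poppler command-line tools (on Windows the file would be pdftoppm.exe).

```python
import os


def dirs_containing(tool, path_string=None):
    """List the PATH directories that contain an executable file named `tool`."""
    if path_string is None:
        path_string = os.environ.get("PATH", "")
    matches = []
    for directory in path_string.split(os.pathsep):
        candidate = os.path.join(directory, tool)
        # Skip empty PATH entries and anything that is not an executable file
        if directory and os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            matches.append(directory)
    return matches


if __name__ == "__main__":
    print(dirs_containing("pdftoppm"))
```

Any folder it prints is a candidate value for poppler_path. An empty list suggests Poppler is not on the PATH of the process running the script.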
Solution 11:[11]
I had the same issue on Mac using Visual Studio Code and a conda environment.
I found out that I could run the code from the command line, but not from VS Code. I then printed the environment variables both when running from the command line and when running in VS Code, using:
print(os.environ)
When I compared the two, I noticed that the "PATH" variable was different. My conda environment was not in the "PATH" variable in VS code. I think this means that VS code was not correctly activating my conda environment. I therefore took my "PATH" from the command line and set it in my launch.json environment variables. Then the problem was fixed.
"configurations": [
    {
        "name": "Python: Current File",
        "type": "python",
        "request": "launch",
        "python": "/Users/<username>/miniconda3/envs/<env_name>/bin/python",
        "env": {
            "PATH": "<PATH STRING from command line>"
        },
        "program": "${file}"
    }
]
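The comparison of the two PATH values described above can be done programmatically instead of by eye. This is a small sketch. The function name missing_path_entries is my own, and the two strings are whatever os.environ["PATH"] printed in each environment.

```python
import os


def missing_path_entries(reference_path, other_path, sep=os.pathsep):
    """Return the entries of reference_path that are absent from other_path, in order."""
    other = set(filter(None, other_path.split(sep)))
    return [p for p in reference_path.split(sep) if p and p not in other]
```

Passing the command-line PATH as reference_path and the VS Code PATH as other_path lists the folders VS Code is missing. In a case like the one above, the conda environment's bin folder would be expected to show up there.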
Solution 12:[12]
If anyone still has this error on Windows, I solved the problem as follows:
- Download the latest Poppler binary from Poppler for Windows
- Unzip it onto the C drive, e.g. C:\poppler-0.68.0
- Specify the Poppler path like this:
from PIL import Image
import pytesseract
import sys
from pdf2image import convert_from_path
import os
ROOT_DIR = os.path.abspath(os.curdir)
# Path of the pdf
PDF_file = ROOT_DIR + r"\PdfToImage\src\2.pdf"
'''
Part #1 : Converting PDF to images
'''
# Store all the pages of the PDF in a variable
pages = convert_from_path(PDF_file, 500, poppler_path=r'C:\poppler-0.68.0\bin')
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
