Poppler in path for pdf2image

I'm trying to use pdf2image and it seems I need something called propeller:

(sum_env) C:\Users\antoi\Documents\Programming\projects\summarizer>python ocr.py -i fr13_idf.pdf
Traceback (most recent call last):
  File "c:\Users\antoi\Documents\Programming\projects\summarizer\sum_env\lib\site-packages\pdf2image\pdf2image.py", line 165, in __page_count
    proc = Popen(["pdfinfo", pdf_path], stdout=PIPE, stderr=PIPE)
  File "C:\Python37\lib\subprocess.py", line 769, in __init__
    restore_signals, start_new_session)
  File "C:\Python37\lib\subprocess.py", line 1172, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "ocr.py", line 53, in <module>
    pdfspliterimager(image_path)
  File "ocr.py", line 32, in pdfspliterimager
    pages = convert_from_path("document-page%s.pdf" % i, 500)
  File "c:\Users\antoi\Documents\Programming\projects\summarizer\sum_env\lib\site-packages\pdf2image\pdf2image.py", line 30, in convert_from_path
    page_count = __page_count(pdf_path, userpw)
  File "c:\Users\antoi\Documents\Programming\projects\summarizer\sum_env\lib\site-packages\pdf2image\pdf2image.py", line 169, in __page_count
    raise Exception('Unable to get page count. Is poppler installed and in PATH?')
Exception: Unable to get page count. Is poppler installed and in PATH?

I tried this link, but the download it points to didn't solve my problem.



Solution 1:[1]

pdf2image is only a wrapper around poppler (not propeller!). To use the module, you need poppler-utils installed on your machine and on your PATH.

The procedure is linked in the project's README in the "How to install" section.
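
The traceback in the question shows pdf2image launching pdfinfo through Popen, so a quick way to confirm the diagnosis is to ask Python whether that executable is visible on PATH. A minimal sketch using only the standard library; the helper name find_poppler_tool is made up for illustration:

```python
import shutil


def find_poppler_tool(tool="pdfinfo"):
    """Return the full path of `tool` if it is on PATH, otherwise None."""
    return shutil.which(tool)


if __name__ == "__main__":
    location = find_poppler_tool()
    if location is None:
        print("pdfinfo not found: Poppler is missing or its bin folder is not on PATH")
    else:
        print("pdfinfo found at", location)
```

If this prints "not found" in the same environment where the script fails, the problem is the PATH rather than pdf2image itself.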

Solution 2:[2]

First, download Poppler from here, then extract it. In your code, pass poppler_path=r'C:\Program Files\poppler-0.68.0\bin' (for example), as shown below:

from pdf2image import convert_from_path
images = convert_from_path("mypdf.pdf", 500,poppler_path=r'C:\Program Files\poppler-0.68.0\bin')
for i, image in enumerate(images):
    fname = 'image'+str(i)+'.png'
    image.save(fname, "PNG")

That's it. With this approach there is no need to edit environment variables. Let me know if you run into any problems.
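
A mistyped poppler_path leads to the same "Unable to get page count" error, so it can help to check the folder before handing it to convert_from_path. The sketch below assumes the folder should contain pdfinfo and pdftoppm (with an .exe suffix on Windows), the tools pdf2image calls by default; missing_poppler_tools is a hypothetical helper, not part of pdf2image.

```python
import os


def missing_poppler_tools(poppler_path, tools=("pdfinfo", "pdftoppm")):
    """Return the tools from `tools` that are not present in `poppler_path`."""
    missing = []
    for tool in tools:
        # Accept both the bare name (Linux/macOS) and the .exe name (Windows).
        candidates = (tool, tool + ".exe")
        if not any(os.path.isfile(os.path.join(poppler_path, c)) for c in candidates):
            missing.append(tool)
    return missing


if __name__ == "__main__":
    folder = r"C:\Program Files\poppler-0.68.0\bin"  # example path from this answer
    print(missing_poppler_tools(folder) or "all required tools found")
```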

Solution 3:[3]

pdf2image and pdftotext both require Poppler as a backend, so you have to install it:

conda install -c conda-forge poppler

The error should then be resolved. If it still doesn't work, you can follow http://blog.alivate.com.au/poppler-windows/ to install the library.

Solution 4:[4]

Poppler is not installed properly. On Debian/Ubuntu, this command installs the correct package:

sudo apt-get install poppler-utils

Solution 5:[5]

For Windows, to solve PDFInfoNotInstalledError: Unable to get page count. Is poppler installed and in PATH?:

Solution 6:[6]

Poppler in path for pdf2image

When working with pdf2image, there are dependencies that need to be satisfied:

  1. Installation of pdf2image

    pip install pdf2image

  2. Installation of python-dateutil

    pip install python-dateutil

  3. Installation of Poppler

  4. Specifying Poppler path in environment variable (system path)

Installing Poppler on Windows

Adding Poppler to path

  • Extract the downloaded Poppler archive, e.g. C:\Users\UserName\Downloads\Release-21.11.0-0.zip
  • Add the bin folder of the extracted directory (the folder that contains pdfinfo.exe, not the .zip file itself) to the Path system variable under Environment Variables

Specifying poppler path in code

pages = convert_from_path(filepath, poppler_path=r"actualpoppler_path")
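
The two options above (system PATH versus an explicit poppler_path) can be combined so the same script runs on machines configured either way. A sketch, assuming a user-chosen environment variable named POPPLER_PATH; that name is an assumption of this example, not something pdf2image reads on its own:

```python
import os


def poppler_kwargs(environ=None):
    """Build keyword arguments for convert_from_path.

    Returns {"poppler_path": ...} when POPPLER_PATH is set, else {} so that
    pdf2image falls back to searching the system PATH.
    """
    environ = os.environ if environ is None else environ
    folder = environ.get("POPPLER_PATH")  # hypothetical variable name
    return {"poppler_path": folder} if folder else {}


def convert(filepath):
    # Imported lazily so the helper above works even without pdf2image installed.
    from pdf2image import convert_from_path
    return convert_from_path(filepath, **poppler_kwargs())
```

With this in place, convert("mypdf.pdf") uses the explicit folder when the variable is set and the normal PATH lookup otherwise.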

Solution 7:[7]

I'm working on a mac in Visual Studio Code and I encountered this error. I followed the install instructions and was able to verify the packages were installed but the error persisted when running in VSC.

Even though I had python.condaPath and python.pythonPath specified in my settings.json, it wasn't until I activated the conda environment inside the VSC integrated terminal itself

conda activate my_env

that the error went away.

Bizarre.

Solution 8:[8]

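After downloading Poppler, add its bin folder to PATH from within your script:

import os
os.environ["PATH"] += os.pathsep + r"C:.....\poppler-xxxxxxx\bin"

This makes Poppler visible to the current process without discarding the existing PATH entries. It worked for me.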
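
A slightly more defensive variant keeps the existing PATH entries and avoids adding the same directory twice. This is only a sketch; add_to_path is a hypothetical helper name, and the optional environ argument exists just to make it easy to try on a plain dict.

```python
import os


def add_to_path(directory, environ=None):
    """Prepend `directory` to PATH unless it is already listed; return the new PATH."""
    environ = os.environ if environ is None else environ
    entries = [e for e in environ.get("PATH", "").split(os.pathsep) if e]
    if directory not in entries:
        entries.insert(0, directory)
    environ["PATH"] = os.pathsep.join(entries)
    return environ["PATH"]
```

Call it with your Poppler bin folder before the first call to convert_from_path; the change only affects the running Python process and anything it launches.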

Solution 9:[9]

In Windows

Install Poppler for Windows.

  • 500 = DPI used to render each page

  • path = the folder that contains the PDF files

  • pip install pdf2image

    import os
    from pdf2image import convert_from_path

    path = r'C:\ABC\FEF\KLH\pdf_extractor\output\break'

    def spliting_pdf2img(path):
        # Convert every PDF in `path` to one JPEG per page.
        for file in os.listdir(path):
            if file.lower().endswith(".pdf"):
                pages = convert_from_path(os.path.join(path, file), 500, poppler_path=r'C:\ABC\DEF\Downloads\poppler-0.68.0\bin')
                for number, page in enumerate(pages, start=1):
                    # Number the output so pages do not overwrite each other.
                    name = "%s_%d.jpg" % (os.path.splitext(file)[0], number)
                    page.save(os.path.join(path, name), 'JPEG')
    

On Linux/Ubuntu, install the packages below from the terminal:

  • sudo apt-get update

  • sudo apt-get install poppler-utils

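    import os
    from pdf2image import convert_from_path

    path = '/path/to/pdf_folder'  # placeholder: folder that contains the PDF files

    def spliting_pdf2img(path):
        # Convert every PDF in `path` to one JPEG per page.
        for file in os.listdir(path):
            if file.lower().endswith(".pdf"):
                pages = convert_from_path(os.path.join(path, file), 500)
                for number, page in enumerate(pages, start=1):
                    # Number the output so pages do not overwrite each other.
                    name = "%s_%d.jpg" % (os.path.splitext(file)[0], number)
                    page.save(os.path.join(path, name), 'JPEG')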
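
The Windows and Linux versions above differ only in whether poppler_path is passed. One way to fold them into a single script is sketched below, assuming Poppler is already on PATH everywhere except Windows; pick_poppler_path is a hypothetical helper.

```python
import platform


def pick_poppler_path(windows_bin, system=None):
    """Return `windows_bin` on Windows, otherwise None (use the system PATH)."""
    system = platform.system() if system is None else system
    return windows_bin if system == "Windows" else None


if __name__ == "__main__":
    # Folder taken from the Windows example above.
    print(pick_poppler_path(r'C:\ABC\DEF\Downloads\poppler-0.68.0\bin'))
```

The result can be passed straight through as convert_from_path(..., poppler_path=pick_poppler_path(...)), since None is the default value of that parameter.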
    

Solution 10:[10]

I had the same problem on my Mac.
I solved it by changing poppler_path from poppler_path='/usr/bin' to poppler_path='/usr/local/bin'. You can also list every directory where Poppler might live on your Mac by running echo $PATH in the Terminal, then try each entry as poppler_path.
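
Rather than trying each PATH entry by hand, the search can be scripted. A sketch that reports which directories on PATH actually hold an executable pdfinfo; the function name dirs_containing is illustrative only:

```python
import os


def dirs_containing(tool="pdfinfo", path_string=None):
    """Return every directory listed in `path_string` that holds an executable `tool`."""
    if path_string is None:
        path_string = os.environ.get("PATH", "")
    hits = []
    for directory in path_string.split(os.pathsep):
        candidate = os.path.join(directory, tool)
        if directory and os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    for directory in dirs_containing():
        print(directory)
```

Any directory it prints is a candidate value for poppler_path.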

Solution 11:[11]

I had the same issue on Mac using Visual Studio Code and a conda environment.

I found out that I could run the code from the command line, but not from VS Code. I then printed the environment variables when running from the command line and in VS Code using:

print(os.environ)

When I compared the two, I noticed that the "PATH" variable was different. My conda environment was not in the "PATH" variable in VS Code. I think this means that VS Code was not correctly activating my conda environment. I therefore took my "PATH" from the command line and set it in the environment variables of my launch.json. That fixed the problem.

"configurations": [
    {
        "name": "Python: Current File",
        "type": "python",
        "request": "launch",
        "python": "/Users/<username>/miniconda3/envs/<env_name>/bin/python",
        "env": {
            "PATH": "<PATH STRING from command line>"
        },
        "program": "${file}"
    }
]
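
Comparing two long PATH strings by eye is error-prone. The sketch below lists the entries that are present in the terminal's PATH but absent from the one VS Code sees; the function name and the sample strings are illustrative only.

```python
import os


def missing_entries(terminal_path, editor_path, sep=os.pathsep):
    """Return entries of `terminal_path` that do not appear in `editor_path`, in order."""
    editor = set(editor_path.split(sep))
    return [e for e in terminal_path.split(sep) if e and e not in editor]


if __name__ == "__main__":
    # Placeholder strings: paste the PATH values printed in each environment.
    terminal_path = "/Users/<username>/miniconda3/envs/<env_name>/bin:/usr/bin"
    editor_path = "/usr/bin"
    print(missing_entries(terminal_path, editor_path, sep=":"))
```

If the conda environment's bin directory shows up in the result, the editor is not activating the environment, which matches the symptom described above.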

Solution 12:[12]

If anyone still has this error on Windows, I solved the problem as follows:

  • Download the latest Poppler binary from the Poppler for Windows page
  • Unzip it onto the C drive, e.g. C:\poppler-0.68.0
  • Specify the Poppler path like this:
from PIL import Image
import pytesseract
import sys
from pdf2image import convert_from_path
import os

ROOT_DIR = os.path.abspath(os.curdir)

# Path of the pdf 
PDF_file = ROOT_DIR + r"\PdfToImage\src\2.pdf"
  
''' 
Part #1 : Converting PDF to images 
'''
  
# Store all the pages of the PDF in a variable 
pages = convert_from_path(PDF_file, 500, poppler_path=r'C:\poppler-0.68.0\bin')