PyPDF2, why am I getting an index error? List index out of range

I'm following along in Al Sweigart's book 'Automate the Boring Stuff' and I'm at a loss with an index error I'm getting. I'm working with PyPDF2, trying to open an encrypted PDF document. I know the book is from 2015, so I checked the PyPDF2.PdfFileReader docs to see if I was missing anything, and everything seems to be the same, at least from what I can tell. So I'm not sure what's wrong here.

My Code

import PyPDF2
pdfReader = PyPDF2.PdfFileReader('encrypted.pdf')
pdfReader.isEncrypted  # is True
pdfReader.getPage(0)

gives:

Traceback (most recent call last):
  File "<pyshell#65>", line 1, in <module>
    pdfReader.getPage(0)
  File "/home/user67/.local/lib/python3.6/site-packages/PyPDF2/pdf.py", line 1176, in getPage
    self._flatten()
  File "/home/user67/.local/lib/python3.6/site-packages/PyPDF2/pdf.py", line 1505, in _flatten
    catalog = self.trailer["/Root"].getObject()
  File "/home/user67/.local/lib/python3.6/site-packages/PyPDF2/generic.py", line 516, in __getitem__
    return dict.__getitem__(self, key).getObject()
  File "/home/user67/.local/lib/python3.6/site-packages/PyPDF2/generic.py", line 178, in getObject
    return self.pdf.getObject(self).getObject()
  File "/home/user67/.local/lib/python3.6/site-packages/PyPDF2/pdf.py", line 1617, in getObject
    raise utils.PdfReadError("file has not been decrypted")
PyPDF2.utils.PdfReadError: file has not been decrypted
>>> pdfReader.decrypt('rosebud')
1
>>> pageObj = pdfReader.getPage(0)
Traceback (most recent call last):
  File "<pyshell#67>", line 1, in <module>
    pageObj = pdfReader.getPage(0)
  File "/home/user67/.local/lib/python3.6/site-packages/PyPDF2/pdf.py", line 1177, in getPage
    return self.flattenedPages[pageNumber]
IndexError: list index out of range

Before asking my question, I did some searching on Google and found a link with a "proposed fix". However, I'm too new at this to see what the fix is. I can't make heads or tails of it.



Solution 1:[1]

I figured it out. The issue is caused by running 'pdfReader.getPage(0)' in the IDLE shell before decrypting the file. If you take that line out, or start over after getting the error and don't run that line again, it works as it should.
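The second traceback in the question goes straight to the flattenedPages lookup without calling _flatten() again, which suggests the failed first call left an empty page list cached on the reader. The toy model below illustrates that kind of failure. It is a hypothetical stand-in (ToyReader and its method names are made up for illustration), not PyPDF2's actual code.

```python
class ToyReader:
    """Hypothetical stand-in for a PDF reader; NOT PyPDF2's real code."""

    def __init__(self):
        self.decrypted = False
        self.flattened_pages = None  # page cache, filled lazily

    def decrypt(self, password):
        self.decrypted = True
        return 1

    def _flatten(self):
        # The cache is set to an empty list first...
        self.flattened_pages = []
        # ...and only then does the read fail, leaving the empty list behind.
        if not self.decrypted:
            raise RuntimeError("file has not been decrypted")
        self.flattened_pages.append("page 0")

    def get_page(self, number):
        # Flattening only happens while the cache is still unset.
        if self.flattened_pages is None:
            self._flatten()
        return self.flattened_pages[number]


reader = ToyReader()
try:
    reader.get_page(0)           # fails, but poisons the cache with []
except RuntimeError as exc:
    print("first call:", exc)

reader.decrypt("rosebud")
try:
    reader.get_page(0)           # cache is [], so indexing fails
except IndexError as exc:
    print("second call:", exc)
```

In this model, a fresh ToyReader that is decrypted before the first get_page(0) call returns the page normally, which matches the "start over" advice above.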

Solution 2:[2]

I got the same error. I was working in the console and called reader.getPage(0) before decrypting. Don't use getPage(#) / pages[#] before decrypt.

Use code like this:

import PyPDF2

reader = PyPDF2.PdfFileReader("file.pdf")
# reader.pages[0]    # do not use this before decrypt
if reader.isEncrypted:
    reader.decrypt('')  # replace '' with the PDF's password if it has one
reader.pages[0]
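The same pattern can be wrapped in a small helper so that no page is touched until decryption has succeeded. This is only a sketch: first_page is a hypothetical name, and it assumes the PyPDF2 1.x API used in this thread (isEncrypted, getPage, and decrypt returning 0 when the password fails).

```python
def first_page(reader, password=""):
    """Return page 0, decrypting first if the reader reports encryption.

    `reader` is expected to behave like PyPDF2.PdfFileReader (1.x API).
    """
    if reader.isEncrypted:
        # In PyPDF2 1.x, decrypt() returns 0 when the password does not match.
        if reader.decrypt(password) == 0:
            raise ValueError("wrong password, file not decrypted")
    # Pages are only accessed after the decrypt step above.
    return reader.getPage(0)


# Example usage (assumes PyPDF2 1.x is installed and encrypted.pdf exists):
#   import PyPDF2
#   reader = PyPDF2.PdfFileReader("encrypted.pdf")
#   page = first_page(reader, "rosebud")
```

Raising on a failed decrypt makes a wrong password visible right away instead of surfacing later as a read error.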

Solution 3:[3]

SELECT DISTINCT a, b, c, d FROM ... [WHERE ...]

This finds all the distinct combinations of a, b, c, d that exist in the table, optionally filtered by the WHERE clause.

It is unlikely that there is a faster way.

If you have PRIMARY KEY(a,b,c,d), then the combination as a whole is unique, not each individual column. That is, all of these combinations could be valid PKs:

a,b,c,d
- - - -
1,1,1,1
1,1,1,2
3,2,1,1
1,2,3,4
7,7,7,7

A fast way to check whether (a,b,c,d) has no dups:

ALTER TABLE t ADD UNIQUE(a,b,c,d)

It will fail if there are any dups. (But it won't tell you anything else. And it will leave behind an extra INDEX if it succeeds.)
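Both checks can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the table name t and the sample rows are made up, and because SQLite does not accept ALTER TABLE ... ADD UNIQUE, CREATE UNIQUE INDEX is swapped in as the closest equivalent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INT, b INT, c INT, d INT)")

# Hypothetical sample data; the last row duplicates the first.
rows = [(1, 1, 1, 1), (1, 1, 1, 2), (3, 2, 1, 1), (1, 1, 1, 1)]
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)

# All distinct (a, b, c, d) combinations present in the table.
distinct = conn.execute("SELECT DISTINCT a, b, c, d FROM t").fetchall()
has_dups = len(distinct) < len(rows)

# Uniqueness check: creating the unique index fails if duplicates exist.
try:
    conn.execute("CREATE UNIQUE INDEX t_abcd ON t (a, b, c, d)")
    index_created = True
except sqlite3.DatabaseError:
    index_created = False

print(has_dups, index_created)
```

As with the ALTER TABLE approach, a successful check leaves the index behind, so drop it afterwards if it is not wanted.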

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution      Source
Solution 1    User67
Solution 2    Martin Thoma
Solution 3    Rick James