Update a fillable PDF using PyPDF2

I'm having trouble updating named fields in a fillable PDF. My code is shown below:

from PyPDF2 import PdfFileReader, PdfFileWriter

reader = PdfFileReader("invoice_template.pdf")
page = reader.getPage(0)

data_dict = {
    "business_name_1": "Consulting",
    "customer_name": "company.io",
    "customer_email": "[email protected]",
}

writer = PdfFileWriter()
writer.updatePageFormFieldValues(page, fields=data_dict)
writer.addPage(page)

with open("newfile.pdf", "wb") as fh:
    writer.write(fh)

I have checked the fields dictionary using reader.getFormTextFields() before and after calling updatePageFormFieldValues(), and the values do get updated. However, the generated PDF shows none of the field values. I'm not sure what I'm doing wrong. The PDF I'm using can be found here
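
The check described above can also be done at the annotation level, which is where updatePageFormFieldValues() writes the new values. The following is only a sketch: widget_values and _resolve are hypothetical helper names, and it assumes PyPDF2 1.x-style objects that expose getObject() (plain dicts work too, which keeps the example runnable without a PDF). It ignores fields whose name is stored on a parent object rather than on the widget itself.

```python
def _resolve(obj):
    """Follow an indirect reference if the object offers getObject()."""
    return obj.getObject() if hasattr(obj, "getObject") else obj


def widget_values(page):
    """Return {field name: /V value or None} for widget annotations on a page.

    `page` is assumed to be a PyPDF2 page object or any dict-like stand-in.
    """
    values = {}
    if "/Annots" not in page:
        return values
    for ref in _resolve(page["/Annots"]):
        annot = _resolve(ref)
        # Only form widgets that carry their own field name are reported.
        if _resolve(annot.get("/Subtype")) != "/Widget" or "/T" not in annot:
            continue
        values[str(annot["/T"])] = annot["/V"] if "/V" in annot else None
    return values
```

If the values reported here are correct but the saved file still looks empty in a viewer, the data was written and the issue is most likely with how the fields are rendered, not with the update call.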



Solution 1:[1]

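The problem is fixed by setting the /NeedAppearances flag in the PDF's /AcroForm dictionary to true. updatePageFormFieldValues() changes each field's value but does not regenerate the field's appearance stream, so a viewer may keep drawing the old (empty) appearance unless this flag tells it to rebuild the appearances. The flag can be set with a function: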

from PyPDF2 import PdfFileWriter
from PyPDF2.generic import BooleanObject, IndirectObject, NameObject

def set_need_appearances_writer(writer: PdfFileWriter):
    # See 12.7.2 and 7.7.2 for more information: http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf
    try:
        catalog = writer._root_object
        # Make sure the catalog has an /AcroForm entry to hold the flag
        if "/AcroForm" not in catalog:
            writer._root_object.update({
                NameObject("/AcroForm"): IndirectObject(len(writer._objects), 0, writer)
            })

        # Ask viewers to regenerate field appearances when the file is opened
        need_appearances = NameObject("/NeedAppearances")
        writer._root_object["/AcroForm"][need_appearances] = BooleanObject(True)
        return writer

    except Exception as e:
        # Report the problem but still return the writer so the caller can continue
        print('set_need_appearances_writer() catch : ', repr(e))
        return writer

Then add the line set_need_appearances_writer(writer) right after the line writer = PdfFileWriter(), and the form should be updated.

You can view more information here: https://github.com/mstamy2/PyPDF2/issues/355

Fixed code

from PyPDF2 import PdfFileWriter, PdfFileReader
from PyPDF2.generic import BooleanObject, NameObject, IndirectObject

def set_need_appearances_writer(writer: PdfFileWriter):
    # See 12.7.2 and 7.7.2 for more information: http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf
    try:
        catalog = writer._root_object
        # Make sure the catalog has an /AcroForm entry to hold the flag
        if "/AcroForm" not in catalog:
            writer._root_object.update({
                NameObject("/AcroForm"): IndirectObject(len(writer._objects), 0, writer)
            })

        # Ask viewers to regenerate field appearances when the file is opened
        need_appearances = NameObject("/NeedAppearances")
        writer._root_object["/AcroForm"][need_appearances] = BooleanObject(True)
        return writer

    except Exception as e:
        # Report the problem but still return the writer so the caller can continue
        print('set_need_appearances_writer() catch : ', repr(e))
        return writer

myfile = PdfFileReader("invoice_template.pdf")
first_page = myfile.getPage(0)

writer = PdfFileWriter()
set_need_appearances_writer(writer)

data_dict = {
    'business_name_1': 'Consulting',
    'customer_name': 'company.io',
    'customer_email': '[email protected]',
}

writer.updatePageFormFieldValues(first_page, fields=data_dict)
writer.addPage(first_page)

with open("newfile.pdf", "wb") as new:
    writer.write(new)
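
To confirm that the flag actually reached the output file, the raw bytes of newfile.pdf can be searched for it. This is a rough heuristic, not a PDF parser: it assumes the writer stores the AcroForm dictionary uncompressed, so that the text "/NeedAppearances true" appears literally in the file. The function names has_need_appearances and file_has_need_appearances are made up for this sketch.

```python
import re

# The flag as it would appear in an uncompressed dictionary, e.g.
# "<< /NeedAppearances true >>". Compressed object streams would hide it.
_NEED_APPEARANCES = re.compile(rb"/NeedAppearances\s+true\b")


def has_need_appearances(pdf_bytes: bytes) -> bool:
    """Heuristic: True if the raw bytes contain '/NeedAppearances true'."""
    return _NEED_APPEARANCES.search(pdf_bytes) is not None


def file_has_need_appearances(path: str) -> bool:
    """Read a file from disk and apply the same heuristic."""
    with open(path, "rb") as fh:
        return has_need_appearances(fh.read())
```

For example, file_has_need_appearances("newfile.pdf") can be called after the script above has run. A False result suggests the flag was not written, or that it is stored in a form this simple search cannot see.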

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 West