How to show the page number of each difference in a PDF text comparison HTML table?

This HTML table compares two novels. It flags each detected difference with the 'n' marker, but the number it lists alongside is the word count (the position of the word in the text). Instead, I would like it to list the page number in each of the two documents where the difference occurs.

import difflib

import nltk
import textract
from IPython.core.display import HTML

# OCR each edition to plain text (textract.process returns bytes)
book1 = textract.process('scarletfirsted.pdf', method='tesseract').decode()
book2 = textract.process('scarletsecond.pdf', method='tesseract').decode()

# Compare word by word, so each row of the table holds one token
words1 = nltk.word_tokenize(book1)
words2 = nltk.word_tokenize(book2)

# Build the side-by-side diff table and render it in the notebook
htmldiff = difflib.HtmlDiff()
tbl = htmldiff.make_table(words1, words2, context=True)
HTML(tbl)
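
One possible direction is sketched below. It is not a confirmed solution. It assumes the text of each book is already available page by page as a list of strings (the hypothetical `pages1` and `pages2`), for example by OCRing each page separately, or by splitting on a form-feed character if the extraction step emits one, which would need to be verified. The helpers `tokens_with_pages`, `diff_rows` and `diff_table` are illustrative names. They record a page number for every token and build a small table from `difflib.SequenceMatcher` opcodes instead of `HtmlDiff`, since `HtmlDiff` numbers rows by their position in the sequence.

```python
import difflib
from html import escape


def tokens_with_pages(pages, tokenize=str.split):
    """Return (tokens, page_numbers) with one 1-based page number per token."""
    tokens, page_numbers = [], []
    for page_no, page_text in enumerate(pages, start=1):
        page_tokens = tokenize(page_text)
        tokens.extend(page_tokens)
        page_numbers.extend([page_no] * len(page_tokens))
    return tokens, page_numbers


def diff_rows(pages1, pages2, tokenize=str.split):
    """Yield (tag, page1, text1, page2, text2) for every non-equal opcode."""
    words1, where1 = tokens_with_pages(pages1, tokenize)
    words2, where2 = tokens_with_pages(pages2, tokenize)
    matcher = difflib.SequenceMatcher(None, words1, words2, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == 'equal':
            continue
        # For a pure insert or delete one side is empty, so report the page
        # of the nearest token on that side (clamped to the last token).
        page1 = where1[min(i1, len(where1) - 1)] if where1 else None
        page2 = where2[min(j1, len(where2) - 1)] if where2 else None
        yield (tag, page1, ' '.join(words1[i1:i2]),
               page2, ' '.join(words2[j1:j2]))


def diff_table(pages1, pages2, tokenize=str.split):
    """Build a plain HTML table listing each difference with its page numbers."""
    header = ('<tr><th>Page (1st)</th><th>Text (1st)</th>'
              '<th>Page (2nd)</th><th>Text (2nd)</th></tr>')
    body = ''.join(
        f'<tr><td>{p1}</td><td>{escape(t1)}</td>'
        f'<td>{p2}</td><td>{escape(t2)}</td></tr>'
        for _tag, p1, t1, p2, t2 in diff_rows(pages1, pages2, tokenize)
    )
    return f'<table>{header}{body}</table>'
```

In the notebook this could be rendered with `HTML(diff_table(pages1, pages2, tokenize=nltk.word_tokenize))`. The default `str.split` tokenizer is only there to keep the sketch free of extra dependencies.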


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
