Remove overlapping text watermarks (two watermark texts in one PDF file) from a PDF using Python

I'm trying to remove watermarks from PDF files so that I can use the files for data extraction.

I tried the approach described here: https://stackoverflow.com/questions/66528514/how-to-remove-watermark-from-pdf-file-using-pythons-pypdf2-lib. It works for most watermarks.

But there are a couple of scenarios where this code does not remove the watermark. For example: the watermark is at a -45 degree angle, or the page has two watermarks, one in the middle of the page with three short lines of text and another at a 45 degree angle overlapping the middle one.

Is there a way to remove these watermarks through code?
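
For the angled case, here is a minimal sketch of one idea rather than a confirmed fix. It assumes the diagonal watermark is drawn as real text whose rotation is set by a `Tm` (text matrix) operator inside a `BT ... ET` block, and that the page content is available as a list of `(operands, operator)` tuples with operators as bytes, which is the shape the linked answer iterates over. The helper names (`rotation_degrees`, `is_diagonal`, `drop_diagonal_text`) and the 5 degree tolerance are hypothetical. A watermark rotated through a `cm` operator, or drawn as an image or form XObject, would not be caught by this.

```python
import math


def rotation_degrees(matrix):
    """Return the rotation angle (degrees) encoded in a PDF matrix.

    `matrix` holds the six operands [a, b, c, d, e, f] of a `Tm` operator.
    For a pure rotation, a = cos(theta) and b = sin(theta).
    """
    a, b = float(matrix[0]), float(matrix[1])
    return math.degrees(math.atan2(b, a))


def is_diagonal(matrix, target=45.0, tolerance=5.0):
    """True if the matrix rotates by roughly +target or -target degrees."""
    return abs(abs(rotation_degrees(matrix)) - target) <= tolerance


def drop_diagonal_text(operations):
    """Drop every BT ... ET text block that sets a diagonal text matrix.

    `operations` is a list of (operands, operator) tuples (assumed layout).
    """
    kept, block = [], []
    in_text = rotated = False
    for operands, operator in operations:
        if operator == b"BT":
            # Start buffering a new text block.
            in_text, rotated, block = True, False, [(operands, operator)]
        elif in_text:
            block.append((operands, operator))
            if operator == b"Tm" and is_diagonal(operands):
                rotated = True
            elif operator == b"ET":
                # Keep the block only if no diagonal matrix was seen in it.
                if not rotated:
                    kept.extend(block)
                in_text = False
        else:
            kept.append((operands, operator))
    if in_text:
        # Unterminated block: keep it rather than silently losing content.
        kept.extend(block)
    return kept
```

The filtered list would then replace the content stream's operations before the page is written out, in the same way the linked answer writes back its modified operations. The second, horizontal watermark in the middle of the page would still need to be matched by its text, as in that answer.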



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
