Web scraping - How can I extract only the new text of a webpage?
I want to extract the content of a webpage to which text is added regularly, on a weekly basis. The difficulty is that the page is basically a database, so there is no easy way to know precisely where content has been added.
Let us say that I have already extracted all the content at a given time t.
I would love to have a tool that would, at time t+1, select only the content that I had not extracted before.
I am totally new to web scraping techniques, so I have no idea how to do that. Thank you in advance for your helpful tips!
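One possible way to approach this, offered as a minimal sketch and not as a definitive answer: split the page into text blocks, store a fingerprint (hash) of every block already extracted at time t, and at t+1 keep only the blocks whose fingerprint is unknown. The names `TextBlockParser`, `extract_blocks`, `fingerprint` and `new_blocks` are hypothetical, and the sample HTML stands in for the real page, which would first have to be downloaded (for example with `urllib.request.urlopen`).

```python
import hashlib
from html.parser import HTMLParser


class TextBlockParser(HTMLParser):
    """Collects the visible text fragments of an HTML page, one per text node."""

    SKIPPED_TAGS = {"script", "style"}  # their content is not visible text

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # normalise whitespace
        if text and self._skip_depth == 0:
            self.blocks.append(text)


def extract_blocks(html):
    """Return the list of text blocks found in an HTML string."""
    parser = TextBlockParser()
    parser.feed(html)
    parser.close()
    return parser.blocks


def fingerprint(text):
    """Hash a block so only short digests need to be stored between runs."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def new_blocks(html, seen):
    """Return blocks not seen before, and record them in `seen` (a set)."""
    fresh = []
    for block in extract_blocks(html):
        digest = fingerprint(block)
        if digest not in seen:
            seen.add(digest)
            fresh.append(block)
    return fresh


# Hypothetical snapshots of the page at time t and t+1.
page_t = "<ul><li>Entry A</li><li>Entry B</li></ul>"
page_t1 = "<ul><li>Entry A</li><li>Entry C</li><li>Entry B</li></ul>"

seen = set()
new_blocks(page_t, seen)          # first run: everything counts as new
print(new_blocks(page_t1, seen))  # ['Entry C']
```

Because the comparison is by content and not by position, it does not matter where on the page the new text was inserted. Between weekly runs the `seen` set would need to be saved, for instance to a JSON file. Note that this sketch treats an edited block as new and ignores deleted blocks; whether that is acceptable depends on the page.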
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
