Removing text enclosed between HTML tags using JSoup
When cleaning HTML, I sometimes want to retain the text enclosed between the tags (which is the default behaviour of Jsoup), and sometimes want to remove the text along with the HTML tags. Can someone please shed some light on how I can remove the text enclosed between the HTML tags using Jsoup?
Solution 1:[1]
String html = "<!DOCTYPE html><html><head><title></title></head><body><p>hello there</p></body></html>";
Document d = Jsoup.parse(html);
System.out.println(d); // the document before removal
System.out.println("************************************************");
d.getElementsByTag("p").remove(); // removes every <p> element together with its text
System.out.println(d); // the document after removal
If you run into trouble working with an Elements collection, you can perform this removal directly on the Document object d, as shown above. That works reliably.
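To contrast the two behaviours asked about in the question, here is a minimal sketch, assuming jsoup is on the classpath. The class name `TagTextRemoval` and the helper methods `keepText` and `dropText` are hypothetical names chosen for illustration; only `Jsoup.parse`, `getElementsByTag`, `Elements.unwrap()`, `Elements.remove()`, and `Element.text()` come from jsoup itself. `unwrap()` drops the tag but keeps its children (including the text), while `remove()` drops the tag together with everything inside it.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class TagTextRemoval {

    // Hypothetical helper: strips the given tag but keeps the text inside it.
    static String keepText(String html, String tag) {
        Document doc = Jsoup.parse(html);
        // unwrap() removes the matched elements and moves their children up
        doc.getElementsByTag(tag).unwrap();
        return doc.body().text();
    }

    // Hypothetical helper: strips the given tag together with its text.
    static String dropText(String html, String tag) {
        Document doc = Jsoup.parse(html);
        // remove() deletes the matched elements and all of their content
        doc.getElementsByTag(tag).remove();
        return doc.body().text();
    }

    public static void main(String[] args) {
        String html = "<div>Hello <span>secret</span> world</div>";
        System.out.println(keepText(html, "span")); // text of <span> is retained
        System.out.println(dropText(html, "span")); // text of <span> is gone
    }
}
```

Choosing between the two per tag lets you decide case by case whether the enclosed text survives, which seems to be what the question is after.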
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Himanshu |
