BeautifulSoup tried to find by text but returned nothing?
The program was intended to search through a table on the web.
single_paper_soup.find('dl').find_all("dt")
returns:
[<dt>Volume:</dt>,
<dt>Month:</dt>,
<dt>Year:</dt>,
<dt>Venues:</dt>,
<dt>Pages:</dt>,
<dt>Language:</dt>,
<dt>URL:</dt>,
<dt>DOI:</dt>,]
However, when I tried to narrow the results by searching on the text:
single_paper_soup.find('dl').find_all("dt",string = "Year")
it returned nothing:
[]
Both the string and text arguments returned nothing.
Is there anything wrong with the code?
Solution 1:[1]
Searching with string or text requires an exact match against the tag's whole string, so in your case the trailing colon has to be included:
soup.find_all("dt",string = "Year:")
There are other options for finding a tag that contains a string or substring:
Use CSS selectors and :-soup-contains():
soup.select('dt:-soup-contains("Year")')
Import re and use re.compile() to match a substring (string is the current name of the older text argument):
soup.find_all("dt", string=re.compile("Year"))
Example
import re
from bs4 import BeautifulSoup
html='''
<dt>Volume:</dt>
<dt>Month:</dt>
<dt>Year:</dt>
<dt>Venues:</dt>
<dt>Pages:</dt>
<dt>Language:</dt>
<dt>URL:</dt>
<dt>DOI:</dt>
'''
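# Name the parser explicitly to avoid the "no parser specified" warning
soup = BeautifulSoup(html, "html.parser")
# CSS selector: substring match on the tag's text
soup.select('dt:-soup-contains("Year")')
# Exact match: the string must equal the tag's whole text, colon included
soup.find_all("dt", string="Year:")
# Regular expression: substring match
soup.find_all("dt", string=re.compile("Year"))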
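To make the difference between exact and substring matching visible, here is a minimal sketch that collects the results of each approach. The shortened HTML snippet and the variable names are only illustrative. The last line passes a function as the string argument, which is another matcher type that find_all() accepts and is not part of the original answer.

```python
import re
from bs4 import BeautifulSoup

# Shortened, illustrative version of the <dl> from the question
html = "<dl><dt>Volume:</dt><dt>Month:</dt><dt>Year:</dt></dl>"
soup = BeautifulSoup(html, "html.parser")

# Exact match fails: no <dt> has the text "Year" without the colon
exact_miss = soup.find_all("dt", string="Year")

# Exact match succeeds once the colon is included
exact_hit = soup.find_all("dt", string="Year:")

# Regular expression: matches any <dt> whose string contains "Year"
regex_hit = soup.find_all("dt", string=re.compile("Year"))

# Function matcher: receives the tag's string (or None) and returns a bool
func_hit = soup.find_all(
    "dt", string=lambda s: s is not None and s.strip().startswith("Year")
)

print(exact_miss)
print(exact_hit, regex_hit, func_hit)
```

The function form is handy when the text needs normalising (stripping whitespace, ignoring case) before it is compared.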
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | HedgeHog |
