BeautifulSoup tried to find by text but returned nothing?

The program was intended to search through a table on the web.

single_paper_soup.find('dl').find_all("dt")

returns:

[<dt>Volume:</dt>,
 <dt>Month:</dt>,
 <dt>Year:</dt>,
 <dt>Venues:</dt>,
 <dt>Pages:</dt>,
 <dt>Language:</dt>,
 <dt>URL:</dt>,
 <dt>DOI:</dt>]

However, when I tried to narrow the results by searching on the text:

single_paper_soup.find('dl').find_all("dt", string="Year")

it returned nothing:

[]

Both the string and text arguments returned nothing.

Is there anything wrong with the code?



Solution 1:[1]

The string (or text) argument matches the tag's entire text exactly, so in your case the search term has to include the colon:

soup.find_all("dt", string="Year:")
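To make the difference concrete, here is a minimal sketch that runs both searches against a shortened copy of the markup from the question. The variable names no_colon and with_colon are only for illustration.

```python
from bs4 import BeautifulSoup

# Shortened copy of the <dl> from the question.
html = "<dl><dt>Month:</dt><dt>Year:</dt></dl>"
soup = BeautifulSoup(html, "html.parser")

# "Year" is not the tag's full text, so nothing matches.
no_colon = soup.find_all("dt", string="Year")

# "Year:" equals the tag's full text, so the tag is found.
with_colon = soup.find_all("dt", string="Year:")

print(no_colon)    # []
print(with_colon)  # [<dt>Year:</dt>]
```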

There are other options for finding a tag that contains a string or substring:

Use CSS selectors with :-soup-contains():

soup.select('dt:-soup-contains("Year")')

Import re and use re.compile() to match the tag's text against a pattern (string is the current name of this argument; text is the older alias):

soup.find_all("dt", string=re.compile("Year"))

Example

import re
from bs4 import BeautifulSoup

html='''
<dt>Volume:</dt>
 <dt>Month:</dt>
 <dt>Year:</dt>
 <dt>Venues:</dt>
 <dt>Pages:</dt>
 <dt>Language:</dt>
 <dt>URL:</dt>
 <dt>DOI:</dt>
'''

soup = BeautifulSoup(html, "html.parser")

# All three calls return [<dt>Year:</dt>]
print(soup.select('dt:-soup-contains("Year")'))
print(soup.find_all("dt", string="Year:"))
print(soup.find_all("dt", string=re.compile("Year")))
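Once the dt is found, you will probably want the value next to it. A rough sketch, assuming each dt is followed by a dd holding its value. The dd contents below are invented for illustration and do not come from the original page.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup: the <dd> values are made up for this example.
html = """
<dl>
  <dt>Month:</dt><dd>May</dd>
  <dt>Year:</dt><dd>2022</dd>
</dl>
"""
soup = BeautifulSoup(html, "html.parser")

# Find the label by substring, then step to the <dd> that follows it.
label = soup.find("dt", string=re.compile("Year"))
value = label.find_next_sibling("dd").get_text(strip=True) if label else None

print(value)  # 2022
```

The check on label guards against the case where no dt matches, since find() returns None then.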

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 HedgeHog