Beautiful Soup - get the desired tag text

I'm very new to Beautiful Soup and am trying to get the text between tags.

databs.txt

<p>$343,343</p><h3>Single</h3><p class=3D'highlight-price' style=3D"margin: 0; font-family: 'Montserrat', sans-serif; text-decoration: none; color: #323232; font-weight: 500; font-size: 16px; line-height: 1.38;">$101,900</p><h3 class=3D"highlight-title" style=3D"margin: 0; margin-bottom: 6px; font-family: 'Montserrat', sans-serif; text-decoration: none; color: #323232; font-weight: 500; font-size: 13px; line-height: 1.45;">Multi</h3><p class=3D'highlight-price' style=3D"margin: 0; font-family: 'Montserrat', sans-serif; text-decoration: none; color: #323232; font-weight: 500; font-size: 16px; line-height: 1.38;">$201,900</p><h3 class=3D"highlight-title" style=3D"margin: 0; margin-bottom: 6px; font-family: 'Montserrat', sans-serif; text-decoration: none; color: #323232; font-weight: 500; font-size: 13px; line-height: 1.45;">Single</h3>

Python

#!/usr/bin/python
from bs4 import BeautifulSoup

# Read the saved HTML fragment and parse it.
with open("databs.txt", "r") as f:
    text = f.read()
soup = BeautifulSoup(text, 'html.parser')

# find() returns only the first matching tag.
page1 = soup.find('p').getText()
print("P1:", page1)
page2 = soup.find('h3').getText()
print("H3:", page2)

Question:

How do I get the text "$101,900, Multi, $201,900, Single"?

python beautifulsoup

Solution 1:^[1]

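If you only want the tags that carry attributes, you can pass a lambda function to find_all. Beautiful Soup treats a non-dictionary attrs value as a filter on the class attribute, so the lambda below matches every tag that has a class set: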

from bs4 import BeautifulSoup

html = """
<p>$343,343</p>
<h3>Single</h3>
<p class=3D'highlight-price' style=3D"margin: 0; font-family: 'Montserrat', sans-serif; text-decoration: none; color: #323232; font-weight: 500; font-size: 16px; line-height: 1.38;">$101,900</p><h3 class=3D"highlight-title" style=3D"margin: 0; margin-bottom: 6px; font-family: 'Montserrat', sans-serif; text-decoration: none; color: #323232; font-weight: 500; font-size: 13px; line-height: 1.45;">Multi</h3><p class=3D'highlight-price' style=3D"margin: 0; font-family: 'Montserrat', sans-serif; text-decoration: none; color: #323232; font-weight: 500; font-size: 16px; line-height: 1.38;">$201,900</p><h3 class=3D"highlight-title" style=3D"margin: 0; margin-bottom: 6px; font-family: 'Montserrat', sans-serif; text-decoration: none; color: #323232; font-weight: 500; font-size: 13px; line-height: 1.45;">Single</h3>
"""
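soup = BeautifulSoup(html, 'lxml')

# Match every tag whose class attribute is set; tags without one are skipped.
tags_with_attribute = soup.find_all(attrs=lambda x: x is not None)

# Join the text of the matched tags into one comma-separated string.
clean_text = ", ".join([tag.get_text() for tag in tags_with_attribute])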

The value of clean_text is then:

'$101,900, Multi, $201,900, Single'
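
The "=3D" sequences in the data are quoted-printable escapes for "=", which suggests the HTML was saved from an email. As a minimal sketch, assuming that is the case, the text can be decoded with the standard-library quopri module first, so the class names become plain "highlight-price" and "highlight-title" and can be selected directly. The function name extract_highlights and the shortened sample are illustrative only and are not part of the original answer.

```python
import quopri

from bs4 import BeautifulSoup

# Shortened sample in the same shape as databs.txt (style values trimmed).
raw = """<p>$343,343</p><h3>Single</h3>
<p class=3D'highlight-price' style=3D"margin: 0;">$101,900</p>
<h3 class=3D"highlight-title" style=3D"margin: 0;">Multi</h3>
<p class=3D'highlight-price' style=3D"margin: 0;">$201,900</p>
<h3 class=3D"highlight-title" style=3D"margin: 0;">Single</h3>
"""


def extract_highlights(markup: str) -> str:
    """Decode quoted-printable escapes, then join the highlighted tag texts."""
    # "=3D" becomes "=", so class=3D'x' turns into a normal class='x'.
    decoded = quopri.decodestring(markup.encode("utf-8")).decode("utf-8")
    soup = BeautifulSoup(decoded, "html.parser")
    # A CSS selector list returns the matching tags in document order.
    tags = soup.select("p.highlight-price, h3.highlight-title")
    return ", ".join(tag.get_text(strip=True) for tag in tags)


clean_text = extract_highlights(raw)
print(clean_text)
```

Selecting by class name rather than by "has any class" keeps the result stable if other styled tags are later added to the page.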

Solution 2:^[2]

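Use the find_all method to collect every p and h3 tag, then iterate over them in pairs with zip. Note that this also prints the first, unstyled pair ($343,343 and Single):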

for p, h3 in zip(soup.find_all('p'), soup.find_all('h3')):
    print("P:",p.getText())
    print("H3:",h3.getText())
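
To print only the highlighted pairs, the same zip pattern can be combined with the class_ filter of find_all. This is a rough sketch, assuming the "=3D" escapes have already been decoded to "=" so that the class names are exactly "highlight-price" and "highlight-title"; the shortened sample markup is illustrative.

```python
from bs4 import BeautifulSoup

# Shortened, already-decoded sample (style attributes omitted).
html = (
    "<p>$343,343</p><h3>Single</h3>"
    "<p class='highlight-price'>$101,900</p>"
    "<h3 class='highlight-title'>Multi</h3>"
    "<p class='highlight-price'>$201,900</p>"
    "<h3 class='highlight-title'>Single</h3>"
)
soup = BeautifulSoup(html, "html.parser")

# Restrict each search to the highlighted tags, then pair price with title.
prices = soup.find_all("p", class_="highlight-price")
titles = soup.find_all("h3", class_="highlight-title")
pairs = [(p.get_text(), h3.get_text()) for p, h3 in zip(prices, titles)]

for price, title in pairs:
    print("P:", price)
    print("H3:", title)
```

Because zip stops at the shorter list, a price without a matching title (or the reverse) is silently dropped, so this pairing only suits markup where the two kinds of tag strictly alternate.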

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	Rustam Garayev
Solution 2
