Scraping <dt> and <dd> tags with bs4 and Python

How can I extract only the information I need from <dt> and <dd> tags? P.S. There are hundreds of pages like this.
Here is the link to the main page:
https://www.aruodas.lt/butai/vilniuje/
and a link to one of its child pages:
https://www.aruodas.lt/butai-vilniuje-santariskese-dangerucio-g-parduodamas-7385-kv-m-triju-kambariu-butas-1-3172400/

My desired output should look like this:

Plotas: 22 m2
Kambariu_skaicius: 4
Metai: 2022 

The code I am using is:

import pandas as pd
from selenium import webdriver
from bs4 import BeautifulSoup
import re
import time

# Raw string so the backslashes in the Windows path are not treated as escapes.
PATH = r'C:\Program Files (x86)\chromedriver.exe'
# Selenium 3 style call; Selenium 4 expects a Service object instead of a path.
driver = webdriver.Chrome(PATH)


def get_dl(soup):
    """Return every <dt>/<dd> pair inside <dl class="obj-details"> as a dict."""
    keys, values = [], []
    for dl in soup.find_all("dl", {"class": "obj-details"}):
        for dt in dl.find_all("dt"):
            keys.append(dt.text.strip())
        for dd in dl.find_all("dd"):
            values.append(dd.text.strip())
    return dict(zip(keys, values))


for puslapis in range(2, 3):
    driver.get(f'https://www.aruodas.lt/butai/vilniuje/puslapis/{puslapis}')
    response = driver.page_source
    soup = BeautifulSoup(response, 'html.parser')
    blocks = soup.find_all('tr', class_='list-row')

    stored_urls = []

    for url in blocks:
        try:
            stored_urls.append(url.a['href'])
        except (TypeError, KeyError):
            # Row without a link; skip it.
            pass

    for link in stored_urls:
        driver.get(link)
        response = driver.page_source
        soup = BeautifulSoup(response, 'html.parser')

        try:
            # The address still needs cleaning up with a regex
            adress = soup.find('h1', 'obj-header-text').text.strip()
            # print(adress)
        except AttributeError:
            adress = 'n/a'

        # Runs for every listing, not only when the address lookup fails.
        dl_dict = get_dl(soup)
        print(dl_dict)

So with this code I get everything that is inside the dd and dt tags, but I only need some of the fields (see the desired output above). The HTML source was attached as a screenshot.



Solution 1:[1]

Pull the <dt> texts and the <dd> texts into two lists, then pair them up with zip.

from bs4 import BeautifulSoup


html = '''<dl class="obj-details  ">
<dt> Namo numeris: </dt>
<dd> 27 </dd>
<hr class="clear">
<dt> Buto numeris: </dt>
<dd> 6 </dd>
<hr class="clear">
<dt> Other: </dt>
<dd> 42 </dd>
<hr class="clear">
</dl>'''


soup = BeautifulSoup(html, 'html.parser')
dt = [x.text.strip() for x in soup.find_all('dt')]
dd = [x.text.strip() for x in soup.find_all('dd')]

myList = list(zip(dt, dd))

for each in myList:
    print(each[0], each[-1])

Output:

Namo numeris: 27
Buto numeris: 6
Other: 42
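
To keep only the fields asked for in the question, the pairs can be filtered by label. The following is a minimal sketch, not a tested scraper for the live site: the sample markup, the label texts in WANTED ("Plotas", "Kambarių sk.", "Metai") and the helper name extract_wanted are assumptions made for illustration, so check them against the real page source. It also pairs each dt with the dd that follows it instead of zipping two lists, which avoids misaligned pairs if a dt has no dd.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the real labels and values on the site may differ.
SAMPLE_HTML = """
<dl class="obj-details">
<dt> Plotas: </dt>
<dd> 73,85 m² </dd>
<hr class="clear">
<dt> Kambarių sk.: </dt>
<dd> 3 </dd>
<hr class="clear">
<dt> Metai: </dt>
<dd> 2022 </dd>
<hr class="clear">
<dt> Aukštas: </dt>
<dd> 2 </dd>
</dl>
"""

# Assumed mapping: label as shown on the page -> name wanted in the output.
WANTED = {
    "Plotas": "Plotas",
    "Kambarių sk.": "Kambariu_skaicius",
    "Metai": "Metai",
}


def extract_wanted(soup, wanted=WANTED):
    """Return only the wanted <dt>/<dd> pairs, keyed by their output names."""
    result = {}
    for dl in soup.find_all("dl", class_="obj-details"):
        for dt in dl.find_all("dt"):
            # Normalise " Plotas: " to "Plotas" before looking it up.
            label = dt.get_text(strip=True).rstrip(":").strip()
            if label not in wanted:
                continue
            # Take the <dd> that follows this <dt> rather than relying on position.
            dd = dt.find_next_sibling("dd")
            if dd is not None:
                result[wanted[label]] = dd.get_text(" ", strip=True)
    return result


info = extract_wanted(BeautifulSoup(SAMPLE_HTML, "html.parser"))
for name, value in info.items():
    print(f"{name}: {value}")
```

In the scraping loop from the question, the same function could be called on the soup of each child page, and the resulting dicts collected in a list.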

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 chitown88