Scraping using Selenium/BS4
I'm trying to scrape data from this page: https://www.flashscore.pl/druzyna/ajax/8UOvIwnb/tabela
How can I separate the results with ";"? How can I select exactly the fields I need?
The data is loaded dynamically.
Current result:
['1.Ajax20153261:548WWWWP']
Expected result (fields separated by ";", leaving out a few values, 20 and 48 in this example):
Ajax;15;3;2;61:5;W;W;W;W;P
Code below:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from bs4 import BeautifulSoup as BS
import requests
from time import sleep
import re
driver = webdriver.Chrome()
driver.get("https://www.flashscore.pl/druzyna/ajax/8UOvIwnb/tabela/")
sleep(10)
page = driver.page_source
soup = BS(page, 'html.parser')
content3 = soup.find('div', {'class': 'ui-table__body'})
content_list3 = content3.find_all('div', {'class': 'tableCellFormIcon tableCellFormIcon--TBD'})
res = []
for i in content3:
    line = i.text.split()[0]
    if re.search('Ajax', line):
        line = line.replace("?", "")
        res.append(line)
print(res)
Solution 1:[1]
Does this solve your problem?
content3 = soup.find('div', {'class': 'ui-table__body'})
# Selector from the question; superseded by the next assignment
content_list3 = content3.find_all('div', {'class': 'tableCellFormIcon tableCellFormIcon--TBD'})
# Keep only the form-icon divs whose title contains "Ajax"
content_list3 = content3.find_all('div', {'class': 'tableCellFormIcon', 'title': re.compile('Ajax')})
res = []
for i in content_list3:
    line = i.text
    line = line.replace("?", "")
    res.append(line)
# Join the collected cell texts with the chosen separator
res = ";".join(res)
print(res)
Instead of concatenating all the text inside the selection and then searching for "Ajax", I selected only the divs that have "Ajax" in their title and read their texts one by one. You can then join them with a separator of your choice.
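To also leave out individual values (such as 20 and 48 in the question), one option is to collect one text per cell and drop the unwanted positions before joining. The snippet below is only a sketch: `format_row` is a hypothetical helper, the `cells` list is typed in by hand from the output shown in the question, and the column positions (0 for the rank, 2 for the value 20, 7 for the value 48) are assumptions about the table layout.

```python
# Hypothetical helper, not part of Selenium or BeautifulSoup.
def format_row(cells, skip=(), sep=";"):
    """Join the cell texts of one table row, leaving out the positions in `skip`."""
    skipped = set(skip)
    return sep.join(text for i, text in enumerate(cells) if i not in skipped)


# Cell texts as they might look once each cell is read separately, e.g. with
# [cell.get_text(strip=True) for cell in row.find_all("div")] (assumed markup).
cells = ["1.", "Ajax", "20", "15", "3", "2", "61:5", "48", "W", "W", "W", "W", "P"]

# Assumed positions: 0 = rank, 2 = the value 20, 7 = the value 48.
line = format_row(cells, skip=(0, 2, 7))
print(line)  # Ajax;15;3;2;61:5;W;W;W;W;P
```

Skipping by position keeps the selection explicit, but it has to be adjusted if the page adds or reorders columns.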
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | zoltankundi |
