How do I scrape a JS popup table with Selenium?
The goal: I need to select an option from a dropdown menu; when a list appears below it, I need to click each entry in turn and scrape all the data shown. Thankfully the classes have proper ID names, so it should be doable, but I am running into the issues described below.
It is easier to follow if you visit the website: www.psx.com.pk/psx/resources-and-tools/listings/listed-companies
Messy code:
chromedriver = "chromedriver.exe"
driver = webdriver.Chrome(chromedriver)
driver.get("https://www.psx.com.pk/psx/resources-and-tools/listings/listed-companies")
select = Select(driver.find_element_by_id("sector"))
for opt in select.options:  # this will loop through all the dropdown options from the site
    opt.click()  # in source code table class gets populated here
    table = driver.find_elements_by_class_name("addressbook")
    for index in range(len(table)):
        # if index % 2 == 0:
        elem = table[index].text
        print(elem)
        elem.click()
        data = driver.find_elements_by_class_name("addressbookdata")
        print(data)
If you run this code, the output is very erratic. When everything works correctly, table.text gives me index/company names, so I thought a quick and dirty way to get only the IDs would be to keep every second index (index % 2) instead of populating a DataFrame first and then dropping the duplicates. Once I have all the IDs, I need to click each of them, then extract the data from addressbookdata and append it all to one DataFrame. I don't think there is a logical problem in my code, but I can't make it work. This is my first time using Selenium; I am much more comfortable with BeautifulSoup.
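One concrete problem in the loop above can be shown without a browser: `.text` returns a plain Python string, so the later `elem.click()` call has nothing to click. The sketch below uses a hypothetical stand-in class, `FakeElement`, which only imitates the `.text` attribute and `.click()` method of a Selenium element; it is not part of Selenium.

```python
class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement (illustration only)."""

    def __init__(self, text):
        self.text = text
        self.clicked = False

    def click(self):
        self.clicked = True


rows = [FakeElement("1"), FakeElement("Company A")]

# What the loop in the question does: keep only the string, then try to click it.
label = rows[1].text
try:
    label.click()
    error = None
except AttributeError as exc:
    error = str(exc)

# Keeping the element itself allows both reading its text and clicking it.
element = rows[1]
name = element.text
element.click()
```

In the real loop this would mean keeping `table[index]` in its own variable and reading `.text` from it only for printing.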
Solution 1:[1]
Select the dropdown option by its value, wait for the table to appear, then parse the table's HTML with pandas.
import pandas as pd
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import Select
driver = webdriver.Chrome(ChromeDriverManager().install())
url = 'https://www.psx.com.pk/psx/resources-and-tools/listings/listed-companies'
driver.get(url)
driver.maximize_window()
wait = WebDriverWait(driver, 30)
# select an option from the dropdown by its value
Select(wait.until(EC.visibility_of_element_located((By.XPATH, "//select[@id='sector']")))).select_by_value("0801")
# wait for the table container to become visible and grab its HTML
dptable = wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@class="table-responsive"]'))).get_attribute("outerHTML")
# read_html returns a list of DataFrames, one per table found in the HTML
df = pd.read_html(dptable)
print(df)
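The snippet above reads a single sector (`0801`). To cover every entry in the dropdown, as the question asks, the same steps could be repeated per option. The following is a rough sketch, not a tested scraper: `usable_values` and `scrape_all_sectors` are hypothetical helper names, treating options with an empty `value` as placeholders is an assumption about the page, and the wait may hand back the previous sector's table if the page updates slowly. It does not open the per-company popups.

```python
from io import StringIO


def usable_values(values):
    """Drop empty placeholder values and duplicates, keeping the original order."""
    return list(dict.fromkeys(v for v in values if v))


def scrape_all_sectors(driver, timeout=30):
    """Return one DataFrame holding the table of every sector (sketch only)."""
    # Imported here so usable_values() can be tried without Selenium or pandas installed.
    import pandas as pd
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import Select, WebDriverWait

    wait = WebDriverWait(driver, timeout)
    dropdown = wait.until(EC.visibility_of_element_located((By.ID, "sector")))
    # Read the values up front so the loop does not hold on to option elements.
    values = usable_values(o.get_attribute("value") for o in Select(dropdown).options)

    frames = []
    for value in values:
        # Re-locate the dropdown each time in case the page re-rendered it.
        Select(driver.find_element(By.ID, "sector")).select_by_value(value)
        html = wait.until(
            EC.visibility_of_element_located((By.XPATH, '//*[@class="table-responsive"]'))
        ).get_attribute("outerHTML")
        try:
            tables = pd.read_html(StringIO(html))
        except ValueError:
            continue  # pandas found no <table> for this sector
        frame = tables[0]
        frame["sector_value"] = value  # remember which option produced these rows
        frames.append(frame)

    return pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()
```

With the `driver` created above, `scrape_all_sectors(driver)` would return a single frame in which the `sector_value` column records the dropdown option each row came from.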
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | F.Hoque |
