Cannot extract/load all hrefs from iframe (inside HTML page) while parsing a webpage
I am really struggling with this case and have been trying all day, so I would appreciate your help. I am trying to scrape this webpage:
https://decisions.scc-csc.ca/scc-csc/en/d/s/index.do?cont=&ref=&d1=2012-01-01&d2=2022-01-31&p=&col=1&su=16&or=
I want to get all 137 hrefs (137 documents). The code I used:
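    import time
    from urllib.parse import urljoin

    from bs4 import BeautifulSoup
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys

    def test(self):
        final_url = 'https://decisions.scc-csc.ca/scc-csc/en/d/s/index.do?cont=&ref=&d1=2012-01-01&d2=2022-01-31&p=&col=1&su=16&or='
        self.driver.get(final_url)
        # The results sit inside an iframe; read its src from the outer page.
        soup = BeautifulSoup(self.driver.page_source, 'html.parser')
        iframe = soup.find('iframe')
        src = iframe['src']
        base = 'https://decisions.scc-csc.ca/'
        main_url = urljoin(base, src)
        # Load the iframe document directly.
        self.driver.get(main_url)
        browser = self.driver
        elem = browser.find_element(By.TAG_NAME, "body")
        # Press Page Down 20 times to try to trigger loading of more results.
        no_of_pagedowns = 20
        while no_of_pagedowns:
            elem.send_keys(Keys.PAGE_DOWN)
            time.sleep(0.2)
            no_of_pagedowns -= 1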
The problem is that it loads only the first 25 documents (hrefs), and I don't know how to load the rest.
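For reference, one common way to handle a list that grows as you scroll is to keep scrolling until no new links appear, rather than pressing Page Down a fixed number of times. The sketch below is untested against this site and rests on an assumption: that the iframe page appends more results when scrolled to the bottom. The helper name `collect_hrefs` and its `pause`, `max_rounds`, and `patience` parameters are hypothetical, and the `a[href]` selector matches every link on the page, so the result would still need filtering down to the document links.

```python
import time

# JavaScript run in the page: return every link's href, or scroll to the bottom.
LINKS_JS = "return Array.from(document.querySelectorAll('a[href]'), a => a.href);"
SCROLL_JS = "window.scrollTo(0, document.body.scrollHeight);"


def collect_hrefs(driver, pause=1.0, max_rounds=100, patience=3):
    """Scroll to the bottom repeatedly until no new links show up.

    `driver` is any object with Selenium's `execute_script` method.
    Stops after `patience` consecutive rounds without new links,
    or after `max_rounds` rounds in total.
    """
    seen = {}  # a dict keeps insertion order and drops duplicates
    idle_rounds = 0
    for _ in range(max_rounds):
        before = len(seen)
        for href in driver.execute_script(LINKS_JS):
            seen.setdefault(href, None)
        if len(seen) == before:
            idle_rounds += 1
            if idle_rounds >= patience:
                break
        else:
            idle_rounds = 0
        driver.execute_script(SCROLL_JS)
        time.sleep(pause)  # give the page time to fetch the next batch
    return list(seen)
```

In the code above it would be called as `hrefs = collect_hrefs(self.driver)` after `self.driver.get(main_url)`. If the page loads further results some other way (for example, through a paging parameter in the URL), this loop would not help and the requests would need to be inspected instead.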
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
