Scraping with XPath and lxml returns a blank/empty list
Kind of a noob here. This is not my first time web scraping, but this one is giving me headaches:
Using lxml, I'm trying to scrape some data from a webpage. I have managed to extract data from other websites, but I'm having trouble with this one.
I'm trying to get the value "44 kg CO2-eq/m2" from this website:
import lxml.etree
from lxml import html
import requests
# Request the page
page = requests.get('https://www.bs2.ch/energierechner/#/?d=%7B%22area%22%3A%22650%22,%22floors%22%3A%224%22,%22utilization%22%3A2,%22climate%22%3A%22SMA%22,%22year%22%3A4,%22distType%22%3A2,%22dhwType%22%3A1,%22heatType%22%3A%22air%22,%22pv%22%3A0,%22measures%22%3A%7B%22walls%22%3Afalse,%22windows%22%3Afalse,%22roof%22%3Afalse,%22floor%22%3Afalse,%22wrg%22%3Afalse%7D,%22prev%22%3A%7B%22walls%22%3Afalse,%22wallsYear%22%3A1,%22windows%22%3Afalse,%22windowsYear%22%3A1,%22roof%22%3Atrue,%22roofYear%22%3A1,%22floor%22%3Afalse,%22floorYear%22%3A1%7D,%22zipcode%22%3A%228055%22%7D&s=4&i=false')
tree = html.fromstring(page.content)
scraped_text = tree.xpath(
'//*[@id="bs2-main"]/div/div[2]/div/div[2]/div[4]/div/div[2]/div[3]/div[2]/div[2]/div/div[2]/div[1]')
print(scraped_text)
The print call just gives me an empty list [] and not the value I am looking for.
I also tried the full XPath, although I know it is not optimal because it breaks with any possible change to the site's structure.
scraped_text = tree.xpath(
'/html/body/div[1]/div/div[5]/main/div[3]/div/div[2]/div/div[2]/div[4]/div/div[2]/div[3]/div[2]/div[2]/div/div[2]/div[1]')
print(scraped_text)
With this XPath, print also gives me an empty list [].
I checked that the XPath is correct using "XPath Helper" in Chrome. I also tried BeautifulSoup, but without any luck, as it doesn't support XPath.
I found a similar problem on Stack Overflow here: Empty List LXML XPATH
It appears that my XPath is probably defined incorrectly. I have been trying to solve this for days; any help would be nice, thanks!
Edit: I also tried to get another XPath using ChroPath, but I got this feedback:
It might be child of svg/pseudo element/comment/iframe from different src. Currently ChroPath doesn't support for them.
I presume my XPath may be wrong.
Solution 1:[1]
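You can't find the element because you are using requests, and requests does not execute JavaScript. It only downloads the initial HTML, while this page builds its content with JavaScript, so the element never exists in page.content. You need to switch to Selenium WebDriver, which drives a real browser.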
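A minimal sketch of that approach, under a few assumptions: the selenium package and a local Chrome are installed, PAGE_URL is an abbreviated stand-in for the full URL in the question, RESULT_XPATH is the asker's own XPath (not verified against the live page), and fetch_rendered_text is a hypothetical helper name. The first part uses only the standard library to show that the calculator parameters sit in the URL fragment, which an HTTP client never sends to the server.

```python
from urllib.parse import urlsplit

# Abbreviated stand-in for the URL in the question (assumption: substitute
# the full URL, whose "d" parameter carries the complete calculator state).
PAGE_URL = (
    "https://www.bs2.ch/energierechner/"
    "#/?d=%7B%22area%22%3A%22650%22%7D&s=4&i=false"
)

# The XPath the asker copied from the browser; not verified here.
RESULT_XPATH = (
    '//*[@id="bs2-main"]/div/div[2]/div/div[2]/div[4]/div/div[2]'
    "/div[3]/div[2]/div[2]/div/div[2]/div[1]"
)

# Everything after "#" is a fragment. HTTP clients never send it, so
# requests only ever asks the server for /energierechner/.
parts = urlsplit(PAGE_URL)
print(parts.path)      # /energierechner/
print(parts.query)     # prints an empty line: no query reaches the server
print(parts.fragment)  # /?d=%7B%22area%22%3A%22650%22%7D&s=4&i=false


def fetch_rendered_text(url, xpath, timeout=20):
    """Open url in headless Chrome and return the text of the node at xpath."""
    # Imported here so the fragment demo above runs without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Wait until the page's JavaScript has inserted the element.
        element = WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.XPATH, xpath))
        )
        return element.text
    finally:
        driver.quit()
```

Calling fetch_rendered_text(PAGE_URL, RESULT_XPATH) should return the element's text once the browser has rendered it. If the wait times out, the XPath may not match the rendered DOM, or the element may live inside an iframe (as the ChroPath message hints), in which case the driver would first have to switch into that frame with driver.switch_to.frame.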
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | mikebrucks |
