Is there a better way of finding an element when web scraping?
I was curious whether there is a better way of finding an element when web scraping than using XPath. Websites are likely to change, and I'd prefer my program to be robust enough to withstand a few small updates to the HTML. Currently I find the most recent transaction made on an account and scrape a few bits of info, such as the price and who it came from. It all works; I just want a more resilient way of locating the elements.

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.chrome.options import Options
    import asyncio, discord, datetime
    from discord.ext import commands

    url = "https://etherscan.io/address/0x7ef865963d3a005670b8f8df6aed23e456fa75e0"
    # Positional XPaths: value cell and sender link of the first (most recent) row.
    xpaths = [
        '//*[@id="transactions"]/div[2]/table/tbody/tr[1]/td[10]',
        '//*[@id="transactions"]/div[2]/table/tbody/tr[1]/td[7]/a',
    ]

    def main():
        driver_options = Options()
        driver_options.add_argument("--headless")
        # Raw string keeps the backslash literal; options are passed so --headless takes effect.
        # Selenium 3-style call, as in the original; newer releases take the driver path via a Service object.
        driver = webdriver.Chrome(r"Drivers\Chromedriver.exe", options=driver_options)
        driver.get(url)
        results = {}
        try:
            for i in xpaths:
                element = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, i)))
                results[i] = element.get_attribute("textContent")
        finally:
            driver.quit()  # always close the browser, even if a lookup times out

        # The file holds one line: "<value>,<sender>,<time>".
        # The with-block closes the file, so no explicit f.close() is needed.
        with open("CurrentTopBidder.txt", "r") as f:
            info = f.readlines()
            info = info[0].split(",")

        # [:-6] drops the last six characters (presumably a unit suffix) before comparing.
        if int(results[xpaths[0]][:-6]) > int(info[0][:-6]):
            print("higher")
            current_time = datetime.datetime.now().strftime("%H:%M")
            with open("CurrentTopBidder.txt", "w") as f:
                f.write(f"{results[xpaths[0]]},{results[xpaths[1]]},{current_time}")
        print(results)

    if __name__ == "__main__":
        main()

This code should work perfectly to get the information. If there are any issues, comment, and I will see if anything must be changed.
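
One way to make the lookup less sensitive to layout changes is to find each column by its header text instead of a fixed `td[N]` position, so that an added or reordered column does not break the scraper. The following is only a sketch, not the method from the original answer. The helper names (`column_index`, `latest_row_fields`, `scrape_latest`) are hypothetical, the header labels "Value" and "From" are assumptions about the page, and it assumes the table inside `#transactions` has exactly one header cell per data cell.

```python
def column_index(headers, name):
    """Return the position of the column whose header text matches `name`."""
    wanted = name.strip().lower()
    for index, text in enumerate(headers):
        if text.strip().lower() == wanted:
            return index
    raise LookupError(f"no column named {name!r} in {headers!r}")


def latest_row_fields(headers, first_row, names):
    """Map each requested column name to its cell text in the first row."""
    return {name: first_row[column_index(headers, name)].strip() for name in names}


def scrape_latest(driver, names=("Value", "From")):
    """Read the first table row through Selenium.

    The selectors and the default header labels are assumptions about the
    page, not something confirmed by the question.
    """
    # Imported here so the two helpers above can be used without Selenium installed.
    from selenium.webdriver.common.by import By

    table = driver.find_element(By.ID, "transactions")
    headers = [
        th.get_attribute("textContent")
        for th in table.find_elements(By.CSS_SELECTOR, "thead th")
    ]
    cells = [
        td.get_attribute("textContent")
        for td in table.find_elements(By.CSS_SELECTOR, "tbody tr:first-child td")
    ]
    return latest_row_fields(headers, cells, names)
```

With this shape, `scrape_latest(driver)` would replace the loop over `xpaths`, and a renamed or missing header raises a `LookupError` that names the column instead of silently reading the wrong cell. It still depends on the `transactions` id and on the header labels, so it trades one kind of fragility for a smaller one rather than removing it.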
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
