How can I read HTML using Selenium in Python when I cannot see the whole page?
I want to use some text from a Smartsheet page/grid, but when I access it with Selenium (finding the element by ID and copying its text to a variable), the text only includes the grid cells that fit in the browser window. For example, my code grabs every column up to column 7 and every row up to row 20, but column 8 and the rows after 20 are not captured:
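from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from webdriver_manager.chrome import ChromeDriverManager

# Selenium 4 style: the driver path is passed through a Service object
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.maximize_window()
driver.get("https://app.smartsheetgov.com/sheets/54hPp4faksdjfhk46jJwMqqWxChg55GXG6Jmc1?view=grid")
# this will loop until the user logs in to the website
column_names_object = None
while column_names_object is None:
    try:
        column_names_object = WebDriverWait(driver, 30).until(EC.presence_of_element_located((By.ID, "foid:23")))
    except TimeoutException:
        print('Please login to Smartsheet')
# the header element's text holds one column name per line
column_names = column_names_object.text.split("\n")
number_of_columns = len(column_names)
# the grid element's text holds the rows that are currently rendered
grid = driver.find_element(By.ID, "foid:18")
grid_rows = grid.text.split("\n ")
number_of_rows = len(grid_rows)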
How do I get the text that is in the rest of the spreadsheet without having Selenium scroll and read the element multiple times?
I have tried zooming out to 26%. That somewhat works when I do it manually while debugging, but not when the code runs on its own (with driver.execute_script("document.body.style.zoom='26%'")).
To be clear, the code does get the text attribute of column_names_object; the problem is that the object does not contain the whole text. The same is true of the grid object.
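As a point of comparison with the CSS zoom attempt, here is a minimal sketch of another way to make more of the grid fit without scrolling: asking Chrome, through the DevTools command Emulation.setDeviceMetricsOverride, to lay the page out in a much larger viewport. The helper names build_viewport_override and enlarge_viewport and the default sizes are hypothetical. Whether Smartsheet actually renders more cells in response is an untested assumption, since the grid may only render a limited window of rows regardless of viewport size.

```python
def build_viewport_override(width, height, scale=1.0):
    """Build the parameters for the DevTools command
    Emulation.setDeviceMetricsOverride (hypothetical helper)."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return {
        "width": int(width),               # viewport width in CSS pixels
        "height": int(height),             # viewport height in CSS pixels
        "deviceScaleFactor": float(scale),
        "mobile": False,                   # keep desktop layout
    }


def enlarge_viewport(driver, width=6000, height=12000):
    """Ask a Chromium-based driver to lay the page out in a larger viewport.

    `driver` is assumed to be a Selenium Chrome driver, which exposes
    execute_cdp_cmd(). The default sizes are arbitrary guesses.
    """
    params = build_viewport_override(width, height)
    driver.execute_cdp_cmd("Emulation.setDeviceMetricsOverride", params)
    return params
```

In the script above this would be called once after the login loop finishes, for example enlarge_viewport(driver), followed by re-reading the text of the "foid:23" and "foid:18" elements. If the text is still truncated, the page is probably not rendering the off-screen cells at all, and no viewport or zoom change will expose them.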
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow