Is there a way to make sure RSelenium does not skip downloads?
I am trying to download multiple files over multiple pages from a website. I do not include the website in this example because it requires a login, but I hope that the code will provide enough insight into my problem.
Essentially, here are my steps:
- Go on a website.
- Search a term.
- Download 10 results per webpage into a zipped folder.
- Keep downloading until contents from all 20 webpages are in 20 zipped folders.
The problem is that when I check after the code has finished running, there are sometimes only 18 or 19 zipped folders. It seems that a download was "skipped".
My current code is below:
    library("RSelenium")
    library("rvest")
    library("tidyverse")

    # Select all results on the current page, download them, then move to the next page.
    # `remDr` is an already-connected remote driver (its setup is not shown here).
    grab_code <- function(x) {
        # Tick the "select all" checkbox
        select_all <- remDr$findElement(using = "css selector", "li.hideInMobile > input:nth-child(2)")
        select_all$clickElement()
        Sys.sleep(5)
        # Open the folder/export menu
        select_folder <- remDr$findElement(using = "css selector", ".expandable > ul:nth-child(1) > li:nth-child(3) > button:nth-child(1) > span:nth-child(1)")
        select_folder$clickElement()
        Sys.sleep(10)
        # Start the download
        download <- remDr$findElement(using = "css selector", "div.button-group:nth-child(2) > button:nth-child(1)")
        download$clickElement()
        Sys.sleep(30)
        # Go to the next page of results
        pages <- remDr$findElement(using = "css selector", "a.la-TriangleRight")
        pages$clickElement()
        Sys.sleep(30)
    }

    # Run once per results page (pages 1 to 20)
    i <- 1
    while (i < 21) {
        grab_code(i)
        i <- i + 1
    }
In the code, I indicate that I want Selenium to stop downloading on page 20, as noted by i < 21. The problem is that when I check my Downloads folder, I sometimes see fewer than 20 downloads, so one of the pages (or in some instances, two of the pages) is missing. I've tried adding more sleep time between tasks so that the browser waits until a file is fully downloaded before moving on to the next page, but that hasn't worked. I've also tried making the web results more sparse (less content on the page) so that the page itself won't lag.
If anyone has any guidance on how to prevent the code from skipping downloads, I would very much appreciate it.
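One possible direction, offered only as a sketch and not as a confirmed fix: instead of relying on a fixed Sys.sleep() after the download click, poll the download folder until a new .zip has appeared and no partial-download files remain. The helper name wait_for_download, its parameters, and the download_dir path are hypothetical and not part of RSelenium. It also assumes the browser marks in-progress downloads with a .crdownload (Chrome) or .part (Firefox) suffix.

```r
# Hypothetical helper (base R only): wait until a new, fully written .zip
# appears in `dir`. `before` is the vector of file names that were present
# before the download button was clicked.
wait_for_download <- function(dir, before, timeout = 120, poll = 1) {
  deadline <- Sys.time() + timeout
  while (Sys.time() < deadline) {
    now <- list.files(dir)
    # Assumption: in-progress downloads end in .crdownload or .part
    partial <- grepl("\\.(crdownload|part)$", now)
    new_zips <- setdiff(now[grepl("\\.zip$", now)], before)
    if (length(new_zips) > 0 && !any(partial)) {
      return(new_zips)
    }
    Sys.sleep(poll)
  }
  character(0)  # timed out; the caller decides whether to retry or stop
}

# Possible use inside grab_code(), replacing the fixed 30-second sleep
# (assumes `remDr` and `download` as in the question; the path is a guess):
# download_dir <- path.expand("~/Downloads")
# before <- list.files(download_dir)
# download$clickElement()
# got <- wait_for_download(download_dir, before)
# if (length(got) == 0) stop("No completed download detected on this page")
```

Stopping (or retrying the click) when nothing arrives at least makes a skipped page visible at the moment it happens, rather than after all 20 pages have run.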
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
