Web scraping a table embedded in a site through R
I am currently working with the site https://www.pmjdy.gov.in/Archive. To get a report, you first have to enter a date (it has to be a Wednesday), and the search result then opens a web page with the data embedded in a table. Can I somehow scrape this whole table and store it in an Excel document? I have written the following code, but it errors out. I think this is because the page requires a human to pick a Wednesday from the calendar before the report appears. Can I automate this so that I supply the date in the code itself and the data gets downloaded? Would be of great help. Thanks :)
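Since the report date always has to be a Wednesday, one way to avoid picking it by hand is to compute it in R first. This is a small sketch; the function name `most_recent_wednesday` is my own:

```r
# Most recent Wednesday on or before `from`.
# format(d, "%w") gives the weekday as 0-6 with Sunday = 0, so Wednesday is 3.
most_recent_wednesday <- function(from = Sys.Date()) {
  offset <- (as.integer(format(from, "%w")) - 3) %% 7
  from - offset
}

most_recent_wednesday(as.Date("2023-06-15"))  # 2023-06-14, a Wednesday
```

The `%% 7` wraps the offset so the function also works when `from` falls earlier in the week than Wednesday.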
library(RSelenium)
library(rvest)
library(tidyverse)
library(stringr)
library(purrr)
rD <- rsDriver(browser="firefox", port=4567L, verbose=F)
remDr <- rD[["client"]]
#go to the site
remDr$navigate("https://www.pmjdy.gov.in/Archive")
#get tables
tables <- remDr$findElements(using = "class name", value = "table-container")
tableList <- list()
for (i in seq_along(tables)) {
  x <- tables[[i]]$getElementAttribute("innerHTML") %>%
    unlist() %>%
    read_html() %>%
    html_table()
  tableList[[i]] <- x[[1]]
}
system("taskkill /im java.exe /f", intern=FALSE, ignore.stdout=FALSE)
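To answer the automation part: yes, RSelenium can type the date and click the search button for you before you look for the table (in the code above, `findElements` runs before any date has been entered, so there is no table to find yet). Below is a sketch under stated assumptions: the element ids `txtDate` and `btnSearch` and the `dd/mm/yyyy` input format are guesses, so inspect the page to find the real ids and format; `writexl` is one option for the Excel output:

```r
library(RSelenium)
library(rvest)
library(writexl)  # one option for writing the scraped table to Excel

# Helper: check the chosen date really is a Wednesday (%w: Sunday = 0).
is_wednesday <- function(d) format(as.Date(d), "%w") == "3"

report_date <- as.Date("2023-06-14")  # any Wednesday you want
stopifnot(is_wednesday(report_date))
date_string <- format(report_date, "%d/%m/%Y")  # assumed input format

rD <- rsDriver(browser = "firefox", port = 4567L, verbose = FALSE)
remDr <- rD[["client"]]
remDr$navigate("https://www.pmjdy.gov.in/Archive")

# "txtDate" and "btnSearch" are assumptions -- inspect the page for the
# real element ids of the date box and the search button.
date_box <- remDr$findElement(using = "id", value = "txtDate")
date_box$sendKeysToElement(list(date_string))
remDr$findElement(using = "id", value = "btnSearch")$clickElement()
Sys.sleep(3)  # crude wait for the results table to render

# Parse every table on the rendered page and keep the first one.
page_tables <- remDr$getPageSource()[[1]] %>%
  read_html() %>%
  html_table()
write_xlsx(page_tables[[1]], "pmjdy_report.xlsx")

# Shut down cleanly instead of killing java.exe.
remDr$close()
rD$server$stop()
```

Note the `Sys.sleep(3)` is a blunt wait; polling for the table element in a loop would be more robust if the report takes longer to load.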
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
