Progress Bar inside a map function in R - Web Scraping
I have been trying to include a progress bar inside a map function while web scraping.
First, I collect all the links, which returns results within seconds.
library(rvest)
library(dplyr)
library(stringr)
library(purrr)
news_america_mg_01 <- paste0("https://www.americamineiro.com.br/paginas/page/",
                             seq(from = 1, to = 4)) %>%
  map(. %>%
        read_html() %>%
        html_nodes(".gdlr-blog-title a") %>%
        html_attr("href") %>%
        as.data.frame())
Second, and this is where I want to include a progress bar, I extract information from each of the links collected from the website.
news_america_mg_02 <- news_america_mg_01 %>%
  map(. %>%
        # Title
        mutate(title = map_chr(., ~ read_html(.x) %>%
                                 html_node("h1.gdlr-blog-title.entry-title") %>%
                                 html_text()),
               # Date
               data = map_chr(., ~ read_html(.x) %>%
                                html_node(".gdlr-info .updated a") %>%
                                html_text()),
               # Text
               text = map_chr(., ~ read_html(.x) %>%
                                html_node(".size-large+ p") %>%
                                html_text())))
Thanks in advance!!
Solution 1:[1]
Create a wrapper around purrr::map_chr() that ticks a progress bar from the progress package on every element. Credit: James Atkin's post
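map_chr_progress <- function(.x, .f, ...) {
  # Convert formulas/lambdas to a plain function
  .f <- purrr::as_mapper(.f, ...)
  pb <- progress::progress_bar$new(
    total = length(.x),
    format = " [:bar] :current/:total (:percent) eta: :eta",
    force = TRUE
  )
  # Advance the bar once per element, then call the original function
  f <- function(...) {
    pb$tick()
    .f(...)
  }
  # map_chr() has no .id argument (that belongs to map_dfr()), so it is not passed on
  purrr::map_chr(.x, f, ...)
}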
Then you can use that in your dplyr chain.
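As a rough sketch of what that could look like, the example below calls the wrapper inside `mutate()`. It assumes dplyr, purrr and progress are installed. To keep it runnable offline, `fake_title()` is a hypothetical stand-in for the `read_html() %>% html_node() %>% html_text()` step, the `example.com` URLs are placeholders, and the wrapper is repeated in a compact form so the block runs on its own.

```r
library(dplyr)
library(purrr)

# Wrapper from the answer, repeated here so the block is self-contained
map_chr_progress <- function(.x, .f, ...) {
  .f <- purrr::as_mapper(.f, ...)
  pb <- progress::progress_bar$new(
    total = length(.x),
    format = " [:bar] :current/:total (:percent) eta: :eta",
    force = TRUE
  )
  f <- function(...) {
    pb$tick()  # one tick per element processed
    .f(...)
  }
  purrr::map_chr(.x, f, ...)
}

# Hypothetical stand-in for the real scraping step
# (read_html(.x) %>% html_node(...) %>% html_text())
fake_title <- function(url) {
  Sys.sleep(0.05)  # simulate network latency
  paste("Title of", basename(url))
}

# Placeholder links; in the question these come from news_america_mg_01
links <- data.frame(
  url = paste0("https://example.com/post-", 1:10),
  stringsAsFactors = FALSE
)

# The bar advances once per row while the title column is built
result <- links %>%
  mutate(title = map_chr_progress(url, fake_title))
```

In the question's code, the same substitution would be made for each of the three `map_chr()` calls, which gives one bar per extracted column. If purrr 1.0.0 or later is available, `map_chr()` also accepts `.progress = TRUE`, which may make a custom wrapper unnecessary.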
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Jeff Parker |
