Category "web-scraping"

How to scrape an image src using puppeteer in NodeJS?

I'm trying to scrape the source of the first image with a specific class. On the page, there are multiple images with different additional classes but they shar

C# Discord Embed Emoji Scraping

So I have set up a program whose goal is to run through every possible ID and test that ID to see if there is a Discord emoji URL associated with it

Extract everything inside tag, but not tag itself

I'm using BeautifulSoup to scrape text from a website, but I only want the <p> tags for organization. However, I can't use text.findAll('p'), because the

How to get access to data in Github Repo with Nodejs Express

I'm currently trying to get COVID-19 data from the COVID-19 Data Repository by Johns Hopkins: https://github.com/CSSEGISandData/COVID-19 The repo gets updated with new d

Selenium-Wire Your Connection Is Not Secure

I'm using selenium-wire with undetectable chromedriver and it's giving me: "Your Connection To This Site Is Not Secure" when I go into a site, and the https in

Web scraping on sites using infinite scroll

I am trying to scrape data from OpenSea. I can scroll the page, but I don't know how to collect data while scrolling. My code: from selenium import webdriver fro

How would I go about incorporating an if statement in item list?

I need to find the phone numbers on this website. I have come to the conclusion that I need to write an if statement, but I'm not really sure how to do that sinc

How to find an element with Selenium in Python?

import os import selenium from selenium import webdriver import time browser = webdriver.Chrome() browser.get('https://www.skysports.com/champions-league-fixtu

Puppeteer - How to use page.click() inside page.evaluate()

I am scraping a table and each row has a button to show a modal with information. I need to scrape the information from the modal for each row but I don't know

Scraping <span> text</span> with BeautifulSoup and urllib

I want to scrape 2015 from the HTML below: I use the code below but am only able to scrape "Annee": soup.find('span', {'class':'optionLabel'}).get_text() Can someo

Need the number of total pages on a website to iterate but selenium keeps timing out

I'm trying to fix a data crawler that was working perfectly until the last couple of weeks. The script consists of two parts, one that retrieves the links of the ar

What is causing Error TypeError: text is not iterable? - Web scraper Puppeteer NodeJs

I am learning nodejs/puppeteer and having issues getting Puppeteer to fill UPC numbers from a CSV file onto the search bar of a book website. I managed to get a

Get Puppeteer Page/Frame Handle for new page after `ElementHandle.click()`

Using puppeteer, I have a specific page that I am web-scraping for data and screenshot-ing for proof that the data is correct. The web page itself includes a bu

Tweepy for Twitter API v2 - Extracting Additional Fields for Tweet Search

I started playing around with Twitter API v2 in Tweepy. I've had some experience with v1 but it looks like it's changed a bit. I'm trying to search tweets based

Web Scrape pagination in a single URL (cheerio and axios)

Newbie here. I'm working on a web scraping project and wanted some guidance on pagination techniques. I'm scraping this site: https://www.imoney.my/unit-trus

Selenium sync with google account

I've created a function using Selenium undetected chromedriver to create a Google chat with a specified email. Every time I run my code I have to log i

Using R code to scrape data from a webpage into an Excel file

I have written code in R which is supposed to retrieve certain information from a website and import it into an Excel file. I have used it for one website and

Not getting all the HTML data in DevTools on the Zillow website (and others)

I'm trying to scrape real estate data from Zillow. When I look at the HTML code in DevTools, most of the links to the house details are not displayed in the htm

Cannot scrape the correct aspect ratio of the image - Python

I'm having a problem extracting an image from a "Manga" website using Python. Below is the element example on the website: img id="comic" class="loading" onerro

HTTP error 403 in Python 3 web scraping the publications

This is the traceback of the error that is happening when I am trying to put the URL of the publication. It works for the regular websites such as Stack Overflo