Category "web-scraping"

Scroll inside div doesn't work with Puppeteer

I'm trying to scroll an area inside a div using Puppeteer. I tried to follow these answers: https://stackoverflow.com/a/67490337 and https://stackoverflow.com/a/52031392
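
A minimal sketch of the same idea, shown in Python with Selenium since most questions in this category are Python (in Puppeteer itself the equivalent is running the same scrollTop assignment through page.evaluate). The page URL and the div.scrollable selector are placeholders:

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://example.com")  # placeholder page

# Scroll the inner container, not the window: set the element's own scrollTop.
panel = driver.find_element(By.CSS_SELECTOR, "div.scrollable")  # assumed selector
driver.execute_script("arguments[0].scrollTop = arguments[0].scrollHeight;", panel)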

I've been trying to scrape profile pictures on Instagram using this code, but I keep getting TypeError: 'NoneType' object is not subscriptable

import requests
from bs4 import BeautifulSoup as bs

User = input("input the username of the user ")
url = 'https://instagram.com/' + User + '/'
r = requests.get
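
The TypeError usually means a find() call returned None before being subscripted, often because Instagram serves a login page to plain requests. A hedged sketch that guards against that, assuming the og:image meta tag is what carries the profile picture URL:

import requests
from bs4 import BeautifulSoup

resp = requests.get("https://instagram.com/someuser/",
                    headers={"User-Agent": "Mozilla/5.0"})
soup = BeautifulSoup(resp.text, "html.parser")

# find() returns None when the tag is missing (e.g. behind a login wall), so check
# before subscripting to avoid "'NoneType' object is not subscriptable".
meta = soup.find("meta", property="og:image")
if meta is not None:
    print(meta["content"])
else:
    print("No og:image tag found; the page may require login.")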

Handling this website which is redirecting to the same URL with BeautifulSoup

So I'm scraping this website: https://www.hepsiburada.com/apple-macbook-pro-m1-cip-8gb-256gb-ssd-macos-13-qhd-tasinabilir-bilgisayar-uzay-grisi-myd82tu-a-p-HBV
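
A minimal check, assuming the redirect loop is triggered by the default requests User-Agent being rejected: send browser-like headers and inspect the redirect chain.

import requests

url = "https://www.hepsiburada.com/..."   # the product URL from the question
headers = {"User-Agent": "Mozilla/5.0"}   # many shops redirect or block the default requests UA

r = requests.get(url, headers=headers, allow_redirects=True, timeout=30)
# r.history shows every redirect hop; r.url is where you finally landed.
print(r.status_code, [h.status_code for h in r.history], r.url)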

Multiple possible values when searching for an item by XPath | Scrapy

I would like to find the title bar icon with rel='icon' or 'shortcut icon'. So I'm trying to do something like this: response.xpath("head/link[@rel='icon' or 'sho
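
The likely culprit is that in XPath, @rel='icon' or 'shortcut icon' treats the bare string 'shortcut icon' as always-true, so the predicate matches every link element. Each side of the or needs its own comparison; a sketch against a Scrapy response:

# Both branches compare the attribute, otherwise the bare string is truthy for every node.
icons = response.xpath("//head/link[@rel='icon' or @rel='shortcut icon']/@href").getall()
print(icons)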

BeautifulSoup request is returning an empty list from LinkedIn.com/jobs

I'm new to BeautifulSoup and web scraping, so please bear with me. I'm using Beautiful Soup to pull all job post cards from LinkedIn with the title "Security Eng
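
An empty list from find_all() usually means the job cards are injected by JavaScript and never appear in the HTML that requests downloads. A quick hedged check (the URL and class name are guesses) is to look at the raw response text before parsing:

import requests

r = requests.get("https://www.linkedin.com/jobs/search/?keywords=Security%20Engineer",
                 headers={"User-Agent": "Mozilla/5.0"})
# If the class you select never occurs in the raw HTML, BeautifulSoup cannot find it
# either; a browser-based tool (Selenium/Playwright) or an API is needed instead.
print("job card markup present:", "base-card" in r.text)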

Click on hyperlink when scraping data from a table

I am trying to scrape data from a web table. I log in to a website, which gives me access to a web table. That table contains a hyperlink column (Id) like the exa
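
If the login already happens in Selenium, the Id links can be clicked directly; a sketch with hypothetical locators (the table id and cell markup are assumptions):

from selenium.webdriver.common.by import By

links = driver.find_elements(By.CSS_SELECTOR, "table#results td a")  # assumed selector
for i in range(len(links)):
    # Re-locate on every pass: clicking navigates away and stales the old references.
    driver.find_elements(By.CSS_SELECTOR, "table#results td a")[i].click()
    # ... scrape the detail page here ...
    driver.back()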

Python & Selenium: How to get Elements in DevTools with CDP (Chrome DevTools Protocol)

I'd like to get all the source code in the Elements panel with Chrome DevTools. Although I tried the following code, these values do not match the above code. body = d
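
A hedged sketch of pulling the rendered DOM through CDP from Selenium (Chromium-based drivers only, via execute_cdp_cmd), which is much closer to what the Elements panel shows than the original page source:

# DOM.getDocument returns the root node; DOM.getOuterHTML serialises the live DOM.
doc = driver.execute_cdp_cmd("DOM.getDocument", {"depth": -1})
root_id = doc["root"]["nodeId"]
html = driver.execute_cdp_cmd("DOM.getOuterHTML", {"nodeId": root_id})["outerHTML"]
print(len(html))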

I want to scrape another class if the first class is not found (n/a) in BeautifulSoup, how do I code this?

I'm scraping Indiegogo to see how many backers there are. However, because there are two different formats, it first scrapes the content for the first layout, b
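
A common pattern for two layouts: try the first selector and only fall back to the second when find() returns None. Both class names below are placeholders for the two Indiegogo layouts:

el = soup.find("div", class_="backers-layout-one")        # hypothetical class
if el is None:
    el = soup.find("span", class_="backers-layout-two")   # hypothetical fallback class

backers = el.get_text(strip=True) if el is not None else "n/a"
print(backers)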

How to get web data based on a link in an Excel cell?

I'd like to create an Excel sheet where one column contains a link to a website, like this: Link in column A, where there is a MAC address in that URL that chang
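
One way to do this from Python, sketched with openpyxl and requests (assuming the links sit in column A of the first sheet and the result should land in column B):

import requests
from openpyxl import load_workbook

wb = load_workbook("links.xlsx")   # assumed filename
ws = wb.active

for (cell,) in ws.iter_rows(min_col=1, max_col=1):
    # Hyperlink cells keep the URL in cell.hyperlink.target, plain cells in cell.value.
    url = cell.hyperlink.target if cell.hyperlink else cell.value
    if url:
        resp = requests.get(url, timeout=30)
        cell.offset(column=1).value = resp.text[:200]   # store a snippet next to the link
wb.save("links.xlsx")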

Scraping data with BeautifulSoup and Selenium

I am using BeautifulSoup and Selenium to extract web data (BeautifulSoup to parse the HTML page and Selenium to click Next to get to the next list of items on t
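
A hedged pagination sketch, assuming the next-page control is a link literally labelled "Next": parse each page with BeautifulSoup, then let Selenium click through until the link disappears.

from bs4 import BeautifulSoup
from selenium.webdriver.common.by import By
from selenium.common.exceptions import NoSuchElementException

while True:
    soup = BeautifulSoup(driver.page_source, "html.parser")
    # ... extract the current page's items from soup here ...
    try:
        driver.find_element(By.LINK_TEXT, "Next").click()   # assumed link text
    except NoSuchElementException:
        break   # no more pages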

Dynamic (with mouseover/coordinates) web scraping in Python unable to extract information

I'm trying to scrape data that only appears on mouseover (Selenium). It's a concert map, and this is my entire code. I keep getting TypeError: 'ActionChains'
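
That TypeError usually comes from calling an ActionChains instance (or using the class the wrong way) instead of building the chain from the driver. The usual pattern, with placeholder locators for a map section and its tooltip:

from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By

section = driver.find_element(By.CSS_SELECTOR, "svg .section")   # placeholder locator
# ActionChains is a class: construct it with the driver, chain the moves, then perform().
ActionChains(driver).move_to_element(section).perform()
tooltip = driver.find_element(By.CSS_SELECTOR, ".tooltip").text  # placeholder tooltip selector
print(tooltip)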

Python Login to UPS.com returns 403

I had a script that would log in to my UPS.com account to retrieve all incoming packages. The following code was working for a while, but not anymore: import reque
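
A 403 that appears after the code used to work usually points at bot detection rather than a bug. A hedged first step is a requests.Session with browser-like headers so cookies and the User-Agent persist across the login flow, though this alone is often not enough against dedicated bot protection:

import requests

session = requests.Session()
session.headers.update({
    "User-Agent": "Mozilla/5.0",
    "Accept-Language": "en-US,en;q=0.9",
})
# The login URL and payload are placeholders, not UPS's real endpoint or field names.
resp = session.post("https://www.ups.com/login", data={"user": "...", "password": "..."})
print(resp.status_code)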

How to get product ID and UPC from the page source on Target?

I am trying to scrape the product IDs and UPCs of products on Target using Selenium in Python. I cannot find the product ID and UPC on the product page, so I go to the pa
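
If the values only show up in the page source, one hedged approach is to regex the rendered source from Selenium for the embedded JSON keys. Target's internal product id is usually called "tcin", but both key names and the JSON shape here are assumptions:

import re

source = driver.page_source
tcin = re.search(r'"tcin"\s*:\s*"?(\d+)"?', source)   # assumed key for the product id
upc = re.search(r'"upc"\s*:\s*"?(\d+)"?', source)     # assumed key for the UPC
print(tcin.group(1) if tcin else None, upc.group(1) if upc else None)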

Error 403 when scraping Hansard, which uses Cloudflare

I am trying to extract a graph from this link. I need to write a loop to extract the info from graphs like this for a set of specific criteria. Using Developers
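
For Cloudflare-fronted sites, plain requests tends to get a 403 no matter which headers you send. A hedged option is the cloudscraper package, which sometimes passes the basic JavaScript challenge (a real browser via Selenium/Playwright is the fallback when it doesn't):

import cloudscraper   # pip install cloudscraper

scraper = cloudscraper.create_scraper()            # drop-in for a requests.Session
r = scraper.get("https://example.com/graph-page")  # placeholder for the graph link
print(r.status_code)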

Scraping different years from Tableau

I have to scrape this table, but it seems that TableauScraper does not recognise that multiple years are available. Here is the table: https://public.tableau.com/
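
In the tableauscraper package, the extra years are often exposed as a workbook parameter or worksheet filter rather than extra rows, so each year has to be selected explicitly. A sketch under that assumption (the parameter name and value are guesses to be read off getParameters() first, and the exact method names depend on the library version):

from tableauscraper import TableauScraper as TS

ts = TS()
ts.loads("https://public.tableau.com/views/...")   # placeholder for the dashboard URL
wb = ts.getWorkbook()

print(wb.getParameters())                  # look for a year-like parameter here
wb_2020 = wb.setParameter("Year", "2020")  # parameter name and value are assumptions
print(wb_2020.worksheets[0].data)          # per-year data as a DataFrame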

How to scrape similar/related accounts from Instagram in Python?

I am trying to scrape accounts that are similar/related to a given account on Instagram. Querying URLs like https://www.instagram.com/{username}/?_a=1 doesn't prov

Bash sed command issue

I'm trying to further parse an output file I generated using an additional grep command. The code that I'm currently using is: ##!/bin/bash # fetches the links

How to scrape data from Twitter without its API using BeautifulSoup

I'm currently trying to scrape some data from Twitter, like the username, screen name, the content of the tweet, etc. But I've run into some problems: I've been tryi
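
Twitter's pages are rendered by JavaScript, so requests + BeautifulSoup only ever sees an empty shell. A commonly suggested alternative (a different approach, named plainly) was the snscrape package; whether it still works depends on Twitter's current restrictions, and attribute names differ between versions:

import snscrape.modules.twitter as sntwitter

# Iterate tweets for a search query without the official API.
for i, tweet in enumerate(sntwitter.TwitterSearchScraper("python").get_items()):
    if i >= 5:
        break
    print(tweet.date, tweet.user.username, tweet.content)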

find_all() prints everything twice

I just started my first web scraping project, and for some reason, when I try to run this simple code, it prints all of the headlines twice. I have no idea why
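
Duplicate output from find_all() usually means the selector matches each headline twice (for example a desktop and a mobile copy of the markup, or a parent and a child that both carry the class). Tightening the selector or de-duplicating the text is a quick check; the selector below is a placeholder:

seen = set()
for tag in soup.find_all("h3", class_="headline"):   # placeholder selector
    text = tag.get_text(strip=True)
    if text not in seen:          # skip the second copy of each headline
        seen.add(text)
        print(text)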

Scraping multiple sites in one Scrapy spider

I am scraping 6 sites with 6 different spiders, but now I have to scrape these sites in a single spider. Is there a way of scraping multiple links in the same
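
One spider can cover several sites by listing all of their start URLs and branching on the domain inside parse(); the domains and selectors below are placeholders:

import scrapy

class MultiSiteSpider(scrapy.Spider):
    name = "multi_site"
    start_urls = [
        "https://site-one.example/",
        "https://site-two.example/",
        # ... the remaining sites ...
    ]

    def parse(self, response):
        # Route each response to site-specific extraction logic by domain.
        if "site-one.example" in response.url:
            yield {"source": "site-one", "title": response.css("title::text").get()}
        else:
            yield {"source": "other", "title": response.css("title::text").get()}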