Scraping Yell with Python requests gives a 403 error

I have this code:

from requests.sessions import Session
url = "https://www.yell.com/s/launderettes-birmingham.html"

s = Session()
headers = {
    'user-agent':"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36",
}
r = s.get(url,headers=headers)
print(r.status_code)

but I get a 403 status code instead of 200.

I can scrape this data with Selenium, but is there a way to scrape it with requests?



Solution 1:[1]

If you modify your code like so:

print(r.text)
print(r.status_code)

you will see that the reason you are getting a 403 error code is that Yell uses the Cloudflare browser check.

Because that check relies on JavaScript, which requests does not execute, there is no reliable way to get past it with the requests module alone.
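If you want to confirm programmatically that a 403 comes from a Cloudflare browser check rather than some other kind of block, a rough heuristic along these lines may help. The function name and the list of marker strings are my own assumptions, not anything Cloudflare documents as stable, so treat it as a sketch:

```python
def looks_like_cloudflare_challenge(status_code, headers, body):
    """Heuristic guess at whether a response is a Cloudflare challenge page.

    The marker strings below are assumptions based on commonly seen
    challenge pages; they may change at any time.
    """
    # Normalise header names and values so the checks are case-insensitive.
    header_map = {str(k).lower(): str(v).lower() for k, v in headers.items()}
    served_by_cloudflare = (
        "cloudflare" in header_map.get("server", "") or "cf-ray" in header_map
    )
    # Challenge pages are usually returned with 403 or 503.
    blocked = status_code in (403, 503)
    markers = ("just a moment", "checking your browser", "challenge-platform")
    text = body.lower()
    has_marker = any(marker in text for marker in markers)
    return served_by_cloudflare and blocked and has_marker


# Usage with the response object from the question:
# print(looks_like_cloudflare_challenge(r.status_code, r.headers, r.text))
```

If this returns True, changing headers or cookies in requests is unlikely to help, because the page expects a browser to run its script.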

Since you mentioned Selenium, make sure to use an undetected driver package such as undetected-chromedriver. Also, be sure to rotate your IP to avoid getting it blocked.
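A minimal sketch of the browser route, assuming the package meant here is undetected-chromedriver (pip install undetected-chromedriver) and that Chrome is installed. The helper name fetch_rendered_html, the driver_factory parameter and the fixed wait are my own illustrative choices, and there is no guarantee this gets past the check on any given day:

```python
import time


def fetch_rendered_html(url, driver_factory=None, wait_seconds=10):
    """Load url in a real browser and return the resulting page HTML."""
    if driver_factory is None:
        # Imported lazily so this module loads even if the package is missing.
        import undetected_chromedriver as uc
        driver_factory = uc.Chrome
    driver = driver_factory()
    try:
        driver.get(url)
        # Crude fixed wait to give the browser check time to finish.
        time.sleep(wait_seconds)
        return driver.page_source
    finally:
        # Always close the browser, even if loading the page failed.
        driver.quit()


# Usage:
# html = fetch_rendered_html("https://www.yell.com/s/launderettes-birmingham.html")
# print(len(html))
```

The returned HTML can then be parsed the same way you would parse r.text. IP rotation is not shown here, since it depends on whichever proxy provider you use.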

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Zyy