Why is requests.get giving different results from my browser? (web scraping)
I'm very new to Python and I'm trying to build a web scraper that collects ads people have posted on a Dutch site used for selling second-hand items. First, I ask the user for a search_term, a distance, and a postal_code. These three variables are then inserted into the URL and the requests.get call is made. See the function declaration below:
from bs4 import BeautifulSoup
import requests

BASE_URL = 'https://www.marktplaats.nl'

def find_ads(BASE_URL, search_term, distance, postal_code):
    # Build the search URL; distance is in km, the site expects meters
    new_url = f"{BASE_URL}/q/{search_term}/#distanceMeters:{distance * 1000}|postcode:{postal_code}"
    html_text = requests.get(new_url, timeout=5).text
    soup = BeautifulSoup(html_text, 'lxml')
    ads = soup.find_all('li', class_='mp-Listing mp-Listing--list-item')
    print(new_url)
    for index, ad in enumerate(ads):
        ad_name = ad.find('h3', class_='mp-Listing-title').text
        ad_location = ad.find('span', class_='mp-Listing-location').text
        print(ad_location + "\n")
        print(ad_name + "\n")

search_term = "bike"
distance = 3
postal_code = "1234ab"  # must be a string; a bare 1234ab is a SyntaxError
find_ads(BASE_URL, search_term, distance, postal_code)
The url that is created (new_url) is exactly like I want it. When I copy new_url to my browser, I get the page that I want to scrape, with my location and the distance set.
HOWEVER, when I look at the ad_location values, I expect them all to be in the city I'm currently in, but what I see in my console is random cities scattered across the Netherlands. This means the page is somehow ignoring my location.
My question: how come requests.get returns a different result than the same URL in my browser? Why is it ignoring the distance and postal_code I feed into it?
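For context on where the difference can come from: everything after the `#` in a URL is a fragment, which the browser keeps client-side (the page's JavaScript can read it), but HTTP clients never transmit to the server. A small standard-library sketch, using a placeholder URL shaped like the one `find_ads` builds, shows how the URL splits:

```python
from urllib.parse import urlsplit

# Placeholder URL in the same shape as new_url from find_ads
url = "https://www.marktplaats.nl/q/bike/#distanceMeters:3000|postcode:1234ab"

parts = urlsplit(url)
print(parts.path)      # '/q/bike/' -- the part the server actually receives
print(parts.fragment)  # 'distanceMeters:3000|postcode:1234ab' -- kept client-side, never sent
```

So the distance and postcode here live entirely in the fragment, which is one way a browser (running the site's JavaScript) and a plain HTTP fetch can see different results for the "same" URL.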
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow