Webscraper Returning Blank Results (IDE issue?)
One of the very helpful people on here helped me with the web scraper code below. However, it has suddenly stopped returning results: it either prints an empty set() or nothing at all.
Does the code below work for you? I need to know whether this is an issue with my IDE, as it makes no sense for it to work one minute and give random results the next when no changes were made to the code.
from requests_html import HTMLSession
import requests


def get_source(url):
    try:
        session = HTMLSession()
        response = session.get(url)
        return response
    except requests.exceptions.RequestException as e:
        print(e)


def scrape_google(query, start):
    response = get_source(f"https://www.google.co.uk/search?q={query}&start={start}")
    links = list(response.html.absolute_links)
    google_domains = ('https://www.google.',
                      'https://google.',
                      'https://webcache.googleusercontent.',
                      'http://webcache.googleusercontent.',
                      'https://policies.google.',
                      'https://support.google.',
                      'https://maps.google.')
    for url in links[:]:
        if url.startswith(google_domains):
            links.remove(url)
    return links


data = []
for i in range(3):
    data.extend(scrape_google('best place', i * 10))
print(set(data))
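One way to tell an IDE problem apart from a change in what the server sends back might be to log the status code and the link counts before and after filtering. The sketch below is only an illustration: `filter_links` and `summarise` are hypothetical helper names that are not part of the script above, and the prefix tuple is copied from it. It uses the standard library only, so it can be tried without any network access.

```python
# Hypothetical helpers for diagnosis; names are assumptions, not from the original script.
GOOGLE_PREFIXES = (
    "https://www.google.",
    "https://google.",
    "https://webcache.googleusercontent.",
    "http://webcache.googleusercontent.",
    "https://policies.google.",
    "https://support.google.",
    "https://maps.google.",
)


def filter_links(links, excluded_prefixes=GOOGLE_PREFIXES):
    """Return, sorted, the links that do not start with any excluded prefix."""
    # str.startswith accepts a tuple of prefixes, as in the original loop.
    return sorted(url for url in links if not url.startswith(excluded_prefixes))


def summarise(status_code, links):
    """Build a one-line report showing at which stage the links disappear."""
    kept = filter_links(links)
    return f"status={status_code} total_links={len(links)} kept={len(kept)}"
```

Inside `scrape_google`, something like `print(summarise(response.status_code, response.html.absolute_links))` would show whether the page contained any links at all before the Google domains were filtered out. If the same numbers appear when the script is run from a plain terminal, the IDE is unlikely to be the cause.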
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
