Scrapy and Python: DNS lookup failed: no results for hostname lookup - proxy issue?

I am trying to use Scrapy and Python to scrape some pages from within my company's network. I started with the Scrapy tutorial here: https://doc.scrapy.org/en/latest/intro/tutorial.html

When I run code identical to the one on the tutorial page, I get this error:

2018-01-24 11:49:04 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET http://quotes.toscrape.com/robots.txt> (failed 1 times): DNS lookup failed: no results for hostname lookup: quotes.toscrape.com.
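For context, the same failure can be reproduced outside Scrapy with a plain resolver check (a minimal sketch using only the standard library; the hostname is the one from the error above):

```python
import socket

def can_resolve(hostname: str) -> bool:
    """Return True if the hostname resolves via the system resolver,
    False on a DNS failure (socket.gaierror)."""
    try:
        socket.getaddrinfo(hostname, 80)
        return True
    except socket.gaierror:
        return False

# If can_resolve("quotes.toscrape.com") is False on the same machine,
# the failure is at the DNS level, before Scrapy is even involved.
```

A False result here would suggest the corporate network blocks direct DNS resolution and that all traffic has to go through the proxy.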

Thus, I tried to set up my proxy server to get a connection, which I also have to do to use pip install (just as an example). I did this by changing the code of the tutorial, using Amom's approach from Scrapy and proxies:

import scrapy
class QuotesSpider(scrapy.Spider):
    name = "quotes"
    def start_requests(self):
        urls = [
            'http://quotes.toscrape.com/page/1/',
            'http://quotes.toscrape.com/page/2/',
        ]
        for url in urls:
            request = scrapy.Request(url=url, callback=self.parse)
            # placeholder values; Scrapy expects a full proxy URL with a scheme
            request.meta['proxy'] = "http://user:password@proxy:port"
            yield request

    def parse(self, response):
        page = response.url.split("/")[-2]
        filename = 'quotes-%s.html' % page
        with open(filename, 'wb') as f:
            f.write(response.body)
        self.log('Saved file %s' % filename)
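If the proxy requires authentication, setting meta['proxy'] alone may not be enough; one common approach is to send a Basic Proxy-Authorization header alongside it. A minimal sketch of building that header value (the user/password/host values are placeholders, not real credentials):

```python
import base64

def proxy_auth_header(user: str, password: str) -> str:
    """Build a Basic Proxy-Authorization header value for an
    authenticating HTTP proxy."""
    creds = f"{user}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(creds).decode("ascii")

# In the spider, the request would then be built roughly like:
#   request = scrapy.Request(url, callback=self.parse)
#   request.meta['proxy'] = "http://proxy.example.com:8080"
#   request.headers['Proxy-Authorization'] = proxy_auth_header("user", "secret")
```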

Does somebody know how to solve this? I really need to get this to work. Thanks in advance.



Solution 1:[1]

That means they are blocking Scrapy, i.e., they are not allowing anyone to scrape their website. I'm sorry, but you can't do anything about it.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 veera sekhar