Scraping multiple sites in one scrapy spider
I am scraping 6 sites with 6 different spiders, but now I have to scrape these sites in one single spider. Is there a way to scrape multiple links in the same spider?
Solution 1:[1]
I did this by overriding start_requests so a single spider issues one request per site, each with its own callback:

from scrapy import Request

def start_requests(self):
    yield Request('url1', callback=self.url1)
    yield Request('url2', callback=self.url2)
    yield Request('url3', callback=self.url3)
    yield Request('url4', callback=self.url4)
    yield Request('url5', callback=self.url5)
    yield Request('url6', callback=self.url6)
Solution 2:[2]
import spider1
import spider2
import spider3
from scrapy.crawler import CrawlerProcess

if require_spider1:
    spider = spider1
    urls = ['https://site1.com/']
elif require_spider2:
    spider = spider2
    urls = ['https://site2.com/', 'https://site2-1.com/']
elif require_spider3:
    spider = spider3
    urls = ['https://site3.com']

process = CrawlerProcess()
process.crawl(spider, urls=urls)
process.start()
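The if/elif chain above can be replaced with a lookup table, which scales better as more spiders are added. A sketch of that selection step (the spider names and URL lists here are placeholders standing in for the imported spider modules):

```python
# Placeholder registry mapping a requested crawl name to the spider
# (a module or class in the real script) and its start URLs.
SPIDERS = {
    'spider1': ('spider1_module', ['https://site1.com/']),
    'spider2': ('spider2_module', ['https://site2.com/', 'https://site2-1.com/']),
    'spider3': ('spider3_module', ['https://site3.com']),
}

def select_spider(name):
    """Return the (spider, urls) pair for the requested crawl."""
    return SPIDERS[name]
```

Note also that CrawlerProcess.crawl() can be called once per spider before process.start(), so all the spiders can run in one process rather than picking just one.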
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Adeena Lathiya |
| Solution 2 | insearchof |