Scrapy spider code not running because of a syntax error?
My projects keep failing for the same reason: I get a syntax error. I'm using Anaconda and Visual Studio Code, and I think the environment is set up correctly.
The code I'm using is the following:
import scrapy
from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import CrawlSpider, Rule

class BestMoviesSpider(CrawlSpider):
    name = 'best_movies'
    allowed_domains = ['imdb.com']
    start_urls = ['https://www.imdb.com/chart/top']
    rules = (
        Rule(LinkExtractor(restrict_xpaths="//td[@class='titleColumn']/a"), callback='parse_item', follow=True),
    )

    def parse_item(self, response):
        yield {
            'title': response.xpath("//h1/text()").get(),
            'year': response.xpath("//li[@class="ipc-inline-list__item"]/span/text()").get(),
            'duration': response.xpath("(//li[@class="ipc-inline-list__item"])[3]/text()").get(),
            'genre': response.xpath("//span[@class="ipc-chip__text"]/text()").get(),
            'rating': response.xpath("//span[@class="AggregateRatingButton__RatingScore-sc-1ll29m0-1 iTLWoV"]/text()").get(),
            'movie_url': response.url,
        }
The error I'm getting is:
line 18
    'year': response.xpath("//li[@class="ipc-inline-list__item"]/span/text()").get(),
    ^
SyntaxError: invalid syntax
Also, VS Code reports two errors about { and ( not being closed, but I think that's because my code isn't running.
Thank you in advance!
Solution 1:[1]
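The issue is in how the XPath strings are quoted. Each string is opened with a double quote, so the double quote in front of ipc-inline-list__item closes it early and Python cannot parse the rest of the line.
Wrap the XPath in single quotes instead and you should be fine: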
# not
'year': response.xpath("//li[@class="ipc-inline-list__item"]/span/text()").get(),
# Instead use
'year': response.xpath('//li[@class="ipc-inline-list__item"]/span/text()').get(),
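To see the difference without running the spider, here is a minimal sketch that asks Python's own parser whether each version of the line is valid. It only uses the standard library; the helper name `parses` is made up for this illustration, and escaping the inner quotes is shown as a third option that is not part of the answer above.

```python
import ast

# The line as posted: the second double quote ends the string early.
broken = '''response.xpath("//li[@class="ipc-inline-list__item"]/span/text()").get()'''

# Single quotes outside, double quotes inside (the fix suggested above).
fixed = """response.xpath('//li[@class="ipc-inline-list__item"]/span/text()').get()"""

# Alternative: keep the outer double quotes and escape the inner ones.
escaped = r'''response.xpath("//li[@class=\"ipc-inline-list__item\"]/span/text()").get()'''


def parses(source):
    """Return True if `source` is syntactically valid Python (hypothetical helper)."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


for label, source in [("broken", broken), ("fixed", fixed), ("escaped", escaped)]:
    print(label, parses(source))
```

Only the posted version fails to parse. Nothing is executed here, so Scrapy does not need to be installed for the check.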
Solution 2:[2]
It seems to be a problem with the quotes.
Try replacing 'year': response.xpath("//li[@class="ipc-inline-list__item"]/span/text()").get()
with 'year': response.xpath('//li[@class="ipc-inline-list__item"]/span/text()').get()
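Note that 'year' is only the first line Python complains about. As a rough sketch (standard library only, with `is_valid` being a hypothetical helper name), the field expressions from the question can each be checked on their own to see which ones share the same quoting problem:

```python
import ast

# The field expressions exactly as posted in the question.
posted = {
    "title": '''response.xpath("//h1/text()").get()''',
    "year": '''response.xpath("//li[@class="ipc-inline-list__item"]/span/text()").get()''',
    "duration": '''response.xpath("(//li[@class="ipc-inline-list__item"])[3]/text()").get()''',
    "genre": '''response.xpath("//span[@class="ipc-chip__text"]/text()").get()''',
    "rating": '''response.xpath("//span[@class="AggregateRatingButton__RatingScore-sc-1ll29m0-1 iTLWoV"]/text()").get()''',
}


def is_valid(expr):
    """Return True if `expr` parses as Python (hypothetical helper)."""
    try:
        ast.parse(expr)
        return True
    except SyntaxError:
        return False


# Collect every field whose expression does not parse.
needs_fix = [field for field, expr in posted.items() if not is_valid(expr)]
print(needs_fix)  # ['year', 'duration', 'genre', 'rating']
```

So after fixing 'year', the 'duration', 'genre' and 'rating' lines need the same change from outer double quotes to outer single quotes, otherwise the error just moves down a line.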
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Davide Laghi |
| Solution 2 | Takamura |
