XPath issue while looping through tags
I have this piece of code that tries to download these papers, but the loop only outputs the first element.
import scrapy
from urllib.parse import urljoin

class SimpleSpider(scrapy.Spider):
    name = 'simple'
    start_urls = ['https://jmedicalcasereports.biomedcentral.com/articles?query=COVID-19&searchType=journalSearch&tab=keyword']

    def parse(self, response):
        for book in response.xpath('//*[@id="main-content"]/div/main/div[2]/ol'):
            title = response.xpath('/li[3]/article/h3/a/text()').get()
            link = urljoin(
                'https://jmedicalcasereports.biomedcentral.com/', response.xpath('/li[3]/article/ul/li[2]/a/@href').get()
            )
            yield {
                'Title': title,
                'file_urls': [link]
            }
I tried CSS selectors first and then XPath; the problem seems to be in the loop code.
Solution 1:[1]
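First, inside the loop, call xpath on the loop variable book instead of on response, and use a relative expression (one that starts with .//) so that it is evaluated against the current item:
title = book.xpath('.//a/text()').get()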
Second, the XPath in your for line selects the ol element itself rather than its li children, so the loop body runs only once. Select each list item instead. Here is my code; I hope it helps.
def parse(self, response):
    for book in response.xpath('//li[@class = "c-listing__item"]'):
        title = book.xpath('.//a/text()').get()
        link = urljoin(
            'https://jmedicalcasereports.biomedcentral.com/', book.xpath('.//a/@href').get()
        )
        yield {
            'Title': title,
            'file_urls': [link]
        }
The output is:
{'Title': 'Presentation of COVID-19 infection with bizarre behavior and encephalopathy: a case report', 'file_urls': ['https://jmedicalcasereports.biomedcentral.com/articles/10.1186/s13256-021-02851-0']}
2022-04-17 21:54:27 [scrapy.core.scraper] DEBUG: Scraped from <200 https://jmedicalcasereports.biomedcentral.com/articles?query=COVID-19&searchType=journalSearch&tab=keyword>
{'Title': 'Dysentery as the only presentation of COVID-19 in a child: a\xa0case report', 'file_urls': ['https://jmedicalcasereports.biomedcentral.com/articles/10.1186/s13256-021-02672-1']}
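The difference between looping over the list and looping over its items can be reproduced offline. The following is only a sketch: it uses the standard library's xml.etree.ElementTree as a stand-in for Scrapy selectors so that it runs without Scrapy or network access, and the markup, titles, and /articles/a1 style hrefs are made up for illustration rather than taken from the real page.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Hypothetical, simplified markup; the real page is larger and may differ.
HTML = """
<main>
  <ol>
    <li class="c-listing__item"><article><h3><a href="/articles/a1">First paper</a></h3></article></li>
    <li class="c-listing__item"><article><h3><a href="/articles/a2">Second paper</a></h3></article></li>
    <li class="c-listing__item"><article><h3><a href="/articles/a3">Third paper</a></h3></article></li>
  </ol>
</main>
"""

BASE = "https://jmedicalcasereports.biomedcentral.com/"
root = ET.fromstring(HTML)

# Selecting the <ol> matches a single element, so a loop over it runs once.
lists = root.findall(".//ol")

# Selecting the <li> elements matches one element per article.
items = root.findall(".//li[@class='c-listing__item']")

results = []
for item in items:
    # Query relative to the current item, not the whole document.
    anchor = item.find(".//a")
    results.append({
        "Title": anchor.text,
        "file_urls": [urljoin(BASE, anchor.get("href"))],
    })

print(len(lists), len(items))
for row in results:
    print(row)
```

In Scrapy the same pattern is the loop over response.xpath('//li[@class = "c-listing__item"]') with book.xpath('.//a/...') inside it, as in the answer's code.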
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | studymakesmebetter |
