XPath issue while looping through tags

I have this piece of code that tries to download these papers, but the loop only prints the first element.

    import scrapy
    from urllib.parse import urljoin


    class SimpleSpider(scrapy.Spider):
        name = 'simple'
        start_urls = ['https://jmedicalcasereports.biomedcentral.com/articles?query=COVID-19&searchType=journalSearch&tab=keyword']

        def parse(self, response):
            for book in response.xpath('//*[@id="main-content"]/div/main/div[2]/ol'):
                title = response.xpath('/li[3]/article/h3/a/text()').get()
                link = urljoin(
                    'https://jmedicalcasereports.biomedcentral.com/',
                    response.xpath('/li[3]/article/ul/li[2]/a/@href').get()
                )
                yield {
                    'Title': title,
                    'file_urls': [link]
                }

I tried CSS selectors first and then XPath; the problem seems to be in the loop code.



Solution 1:[1]

First, on the line inside your loop that sets title, call xpath() on book instead of response and use a relative path, so that each lookup is scoped to the current item:

    title = book.xpath('.//a/text()').get()

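Second, the XPath in your for line is incorrect: it selects the ol container itself rather than the li items inside it, so the loop body does not run once per article. Loop over the li elements instead. Here is my code; I hope it helps.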

    def parse(self, response):
        # Select each search-result item, not the enclosing <ol>.
        for book in response.xpath('//li[@class = "c-listing__item"]'):
            # Paths starting with './/' are evaluated relative to the current item.
            title = book.xpath('.//a/text()').get()
            link = urljoin(
                'https://jmedicalcasereports.biomedcentral.com/',
                book.xpath('.//a/@href').get()
            )
            yield {
                'Title': title,
                'file_urls': [link]
            }

The output is:

{'Title': 'Presentation of COVID-19 infection with bizarre behavior and encephalopathy: a case report', 'file_urls': ['https://jmedicalcasereports.biomedcentral.com/articles/10.1186/s13256-021-02851-0']}
2022-04-17 21:54:27 [scrapy.core.scraper] DEBUG: Scraped from <200 https://jmedicalcasereports.biomedcentral.com/articles?query=COVID-19&searchType=journalSearch&tab=keyword>
{'Title': 'Dysentery as the only presentation of COVID-19 in a child: a\xa0case report', 'file_urls': ['https://jmedicalcasereports.biomedcentral.com/articles/10.1186/s13256-021-02672-1']}
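
To illustrate why looping over the li items with relative paths yields one result per article, here is a minimal sketch that can be run without Scrapy. It is not the spider itself: it uses the standard library's xml.etree.ElementTree as a stand-in for Scrapy's selectors, and the FRAGMENT markup, the example hrefs, and the extract_items helper are all hypothetical names made up for the illustration. The real page markup is more complex.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

BASE_URL = "https://jmedicalcasereports.biomedcentral.com/"

# Hypothetical, simplified fragment; the real page markup is more complex.
FRAGMENT = """
<div>
  <ol>
    <li class="c-listing__item"><h3><a href="/articles/example-1">First title</a></h3></li>
    <li class="c-listing__item"><h3><a href="/articles/example-2">Second title</a></h3></li>
  </ol>
</div>
"""


def extract_items(markup):
    """Return one dict per list item, using lookups relative to each item."""
    root = ET.fromstring(markup)
    results = []
    for item in root.findall(".//li[@class='c-listing__item']"):
        anchor = item.find(".//a")  # searched inside the current <li> only
        if anchor is None:
            continue  # skip items that have no link
        results.append({
            "Title": anchor.text,
            "file_urls": [urljoin(BASE_URL, anchor.get("href"))],
        })
    return results


root = ET.fromstring(FRAGMENT)
container_count = len(root.findall(".//ol"))  # matches the list container
items = extract_items(FRAGMENT)
print(container_count, len(items))
for entry in items:
    print(entry)
```

In this fragment the ol path matches a single element, so a loop over it would run once, while the li path matches every item. The same scoping idea is what book.xpath('.//a/...') relies on in the spider above.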
        

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 studymakesmebetter