Python Scrapy function for i in

Hello dear Stack Overflow fellows!

I'm having some trouble with this for-in loop.

Here's the code:

  lista1 = pd.read_excel("Produtos_para_buscar.xlsx")
  for i in lista1:
      r = requests.get(f'https://www.petz.com.br/busca?q={i}', headers=headers, params=params)

      response = Selector(text=r.text)

      products = response.xpath('//li[@class="liProduct"]')

I want it to iterate over the entire list and return a response for each value.

But it only gives me the first value of the list.

Any ideas?



Solution 1:[1]

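When indexing a pandas DataFrame, it's best to treat it like a dictionary, especially if it has multiple columns. Iterating over the DataFrame itself yields its column labels, not the cell values, so select the column you want first.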

For example:

import pandas as pd

df = {'col1': [1, 2, 3], 'col2': [3, 2, 1]}
df_pd = pd.DataFrame(df)

# This will get you the values for that column
for i in df_pd['col1']:
    print(i)
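
To see why the loop in the question appears to stop early, it may help to compare the two kinds of iteration side by side. The sketch below uses a made-up single-column DataFrame (the column name `produto` and its values are assumptions, not taken from the asker's spreadsheet): iterating over the DataFrame itself visits only the column label, while iterating over the column visits each value.

```python
import pandas as pd

# Hypothetical stand-in for the spreadsheet; the column name and values are invented.
lista1 = pd.DataFrame({'produto': ['racao', 'coleira', 'brinquedo']})

# Iterating over the DataFrame yields its column labels, so this runs once.
labels = [i for i in lista1]

# Iterating over a column yields the cell values, one per row.
values = [i for i in lista1['produto']]

print(labels)
print(values)
```

With a one-column sheet, the first form would make a single request whose search term is the column header, which would likely explain the "only the first value" symptom.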

Furthermore, if you want a Scrapy version of this:

import pandas as pd
import scrapy


class yourSpider(scrapy.Spider):
    name = 'your_spider'
    lista1 = pd.read_excel("Produtos_para_buscar.xlsx")
    start_urls = []
    # Build one search URL per value in the chosen column
    for i in lista1['column_you_are_after']:
        start_urls.append(f'https://www.petz.com.br/busca?q={i}')

    def start_requests(self):
        for url in self.start_urls:
            yield scrapy.Request(
                url=url,
                callback=self.parse)

    def parse(self, response):
        # xpath() returns a list of selectors, one per matching product
        products = response.xpath('.//li[@class="liProduct"]')
        for items in products:
            yield {
                'first_item': items.xpath('....')
            }
    

My guess is that this is what you're after, and it will likely work for your case. I noticed that the xpath call returns a list, so it's best to treat it like one, which means grabbing the values inside a for loop.
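
One more detail worth considering when building the search URLs: product names with spaces or accented characters may not survive being pasted straight into the query string. Here is a minimal sketch, assuming the search term goes in the `q` parameter as in the URL above. `build_search_url` is a hypothetical helper, not part of Scrapy or pandas, and the sample terms are invented.

```python
from urllib.parse import quote_plus

BASE_URL = 'https://www.petz.com.br/busca'


def build_search_url(term):
    """Return the search URL for one term, percent-encoding unsafe characters.

    Hypothetical helper for illustration; not part of Scrapy or pandas.
    """
    # str() guards against numeric cells read from the spreadsheet.
    return f'{BASE_URL}?q={quote_plus(str(term))}'


# Invented sample terms standing in for the spreadsheet column.
urls = [build_search_url(t) for t in ['coleira', 'ração para gatos']]
for url in urls:
    print(url)
```

In the spider, the same helper could replace the f-string inside the loop that fills `start_urls`. Whether the site expects `+` or `%20` for spaces is an assumption here; `urllib.parse.quote` produces the latter if that turns out to be needed.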

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution    Source
Solution 1  dollar bill