Python Scrapy function for i in
Hello dear Stack Overflow fellows!
I'm having some trouble with this for ... in loop.
Here's the code:
lista1 = pd.read_excel("Produtos_para_buscar.xlsx")
for i in lista1:
    r = requests.get(f'https://www.petz.com.br/busca?q={i}', headers=headers, params=params)
    response = Selector(text=r.text)
    products = response.xpath('//li[@class="liProduct"]')
I want it to iterate over the entire list and return a response for each of the values.
But it's giving me only the first value of the list.
Any ideas?
Solution 1:[1]
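Iterating directly over a pandas DataFrame yields its column labels, not the values in its rows. It's best to treat the DataFrame like a dictionary of columns and select the column you need, especially if it has multiple columns.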
For example:
import pandas as pd

df = {'col1': [1, 2, 3], 'col2': [3, 2, 1]}
df_pd = pd.DataFrame(df)
# This will get you the values for that column
for i in df_pd['col1']:
    print(i)
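To see why the loop in the question does not walk through the product names, here is a small check on the same toy DataFrame. It is only a sketch of the general pandas behaviour, and the names `labels` and `values` are introduced here for illustration:

```python
import pandas as pd

df_pd = pd.DataFrame({'col1': [1, 2, 3], 'col2': [3, 2, 1]})

# Looping over the DataFrame itself yields the column labels.
labels = [i for i in df_pd]

# Looping over a single column yields the values stored in that column.
values = [i for i in df_pd['col1']]

print(labels)
print(values)
```

If the spreadsheet has a single column, `for i in lista1` therefore runs exactly once, with `i` set to that column's header.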
Furthermore, if you want a Scrapy version of this:
import pandas as pd
import scrapy


class YourSpider(scrapy.Spider):
    name = 'your_spider'

    # Build one search URL per value in the chosen column.
    lista1 = pd.read_excel("Produtos_para_buscar.xlsx")
    start_urls = []
    for i in lista1['column_you_are_after']:
        start_urls.append(f'https://www.petz.com.br/busca?q={i}')

    def start_requests(self):
        for url in self.start_urls:
            yield scrapy.Request(
                url=url,
                callback=self.parse)

    def parse(self, response):
        products = response.xpath('.//li[@class="liProduct"]')
        for items in products:
            yield {
                # '....' is a placeholder for the XPath of the field you want.
                'first_item': items.xpath('....')
            }
My guess is that this is what you're after, and it will likely work for your case. I noticed that the xpath call returns a list of selectors, so it's best to treat it like a list, which means extracting the values inside a for loop.
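If you would rather keep the plain requests loop from the question, note that `products` is overwritten on every pass, so only one result survives the loop. A minimal sketch of one way around that is to store each result under its search term. The helper names `search_url` and `collect_per_term` and the `fetch` parameter are hypothetical, introduced here only for illustration; `fetch` stands in for whatever downloads and parses a page.

```python
from urllib.parse import quote_plus

BASE_URL = "https://www.petz.com.br/busca?q="


def search_url(term):
    """Build the search URL for one term, URL-encoding spaces and accents."""
    return BASE_URL + quote_plus(str(term).strip())


def collect_per_term(terms, fetch):
    """Call fetch(url) once per term and keep every result, keyed by term."""
    results = {}
    for term in terms:
        results[term] = fetch(search_url(term))
    return results


if __name__ == "__main__":
    # A stand-in fetch that just echoes the URL, so no network is needed.
    demo = collect_per_term(["cat food", "dog toy"], lambda url: url)
    for term, result in demo.items():
        print(term, result)
```

In the real script, `terms` would be `lista1['column_you_are_after']` and `fetch` would wrap the question's `requests.get(...)` call followed by `Selector(text=r.text).xpath('//li[@class="liProduct"]')`.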
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | dollar bill |
