Scrapy csv pipeline outputs each element in one column
I am scraping product information from Amazon with Scrapy and trying to store it in a CSV through a pipeline in pipelines.py. My pipeline outputs a CSV as seen in the picture link below, where each scraped value is stored in a new row of the same column. Does anyone have an idea what could be wrong with my pipeline? Thanks in advance!
spider:

```python
def parse(self, response):
    for product in response.css('div.a-container'):  # response.css('div.a-section.review-views.celwidget'):
        items = AmazonTestItem()
        rating = product.css('a.a-link-normal::attr(title)').extract()
        price = product.css('span.a-price.aok-align-center.reinventPricePriceToPayPadding.priceToPay span.a-offscreen::text').extract()
        # items['review_text'] = review_text
        items['Rating'] = rating
        items['Price'] = price
        yield items
```
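Note that `.css(...).extract()` returns every match as a list of strings, so each yielded item carries list-valued fields rather than scalars. A minimal plain-Python illustration of why that stacks values in one column (the sample values are hypothetical, not from the actual page):

```python
# extract() returns all matches as a list, so each field is a list, not a scalar.
ratings = ['4.5 out of 5 stars', '4.0 out of 5 stars']  # hypothetical extract() output
prices = ['$19.99', '$24.99']

item = {'Rating': ratings, 'Price': prices}

# Writing each list element as its own CSV row puts everything in one column;
# pairing the lists element-wise keeps one product per row instead.
rows = list(zip(item['Rating'], item['Price']))
```

Pairing with `zip` (or extracting one value per product with `.get()` inside the loop) is what keeps rating and price side by side in separate columns.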
pipeline:

```python
import csv
import datetime

class AmazonTestPipeline:
    def process_item(self, item, spider):
        datetime_now = datetime.datetime.now()
        datetime_string = datetime_now.strftime("%d/%m/%Y %H:%M:%S")
        with open('amazon.csv', 'w', encoding='UTF8', newline='') as csv_1:
            csv_out = csv.writer(csv_1)
            csv_out.writerow(['Rating'])
            csv_out.writerow(['Price'])
            csv_out.writerows([item['Rating'][index]] for index in range(0, len(item['Rating'])))
            csv_out.writerows([item['Price'][index]] for index in range(0, len(item['Price'])))
            csv_out.writerow([datetime_string])
        return item
```
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow