Amazon Web Scraping: AttributeError: 'NoneType' object has no attribute 'text'
I know this has been asked a million times... but I've been looking for hours and can't find a solution. I am just trying to scrape a single page on Amazon and convert the results to csv.
applelist = []
apples = soup.find_all('div', {'class': 'sg-col-4-of-12 s-result-item s-asin sg-col-4-of-16 sg-col s-widget-spacing-small sg-col-4-of-20'})
for item in apples:
    apple = {
        'title': item.find('span', {'class':'a-size-base-plus a-color-base a-text-normal'}).text,
        'link': 'https://www.amazon.com' + item.find('a', {'class':'a-link-normal s-underline-text s-underline-link-text s-link-style a-text-normal'})['href'],
        'rating': item.find('i', {'class':'a-icon-alt'}).text,
    }
    applelist.append(apple)
#print(applelist)
df = pd.DataFrame(applelist)
df.to_csv('file.csv')
I am able to accomplish this when I take out the 'rating' line. But when I add in 'rating' it errors out because not every item has a rating. I've tried messing around with try/except, but I think I'm having trouble because it is inside a for/in loop. Any advice without making me feel too stupid would be appreciated.
Solution 1:[1]
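find() returns None when nothing matches, and None has no .text attribute, which is where the AttributeError comes from. Test whether the rating exists before you access an attribute on it.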
for item in apples:
    rating = item.find('i', {'class':'a-icon-alt'})
    if rating:
        rating = rating.text
    else:
        rating = 'some default value'
    apple = { ... }
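If several fields can be missing, the same check can be pulled into a small helper. The sketch below is only an illustration: text_or_default is a hypothetical name, not part of BeautifulSoup, and the SimpleNamespace object is a stand-in for what item.find(...) returns (something with a .text attribute, or None) so the snippet runs without fetching a page.

```python
from types import SimpleNamespace


def text_or_default(tag, default=""):
    """Return the stripped text of a find() result, or `default` if nothing matched.

    Hypothetical helper for illustration; not part of BeautifulSoup.
    """
    # find() returns None when no element matches, so check before using .text.
    if tag is None:
        return default
    return tag.text.strip()


# Stand-ins for item.find(...) results: one match and one miss (assumed data).
found = SimpleNamespace(text=" 4.5 out of 5 stars ")
missing = None

ratings = [text_or_default(t, default="No rating") for t in (found, missing)]
print(ratings)
```

Inside the loop this would be used as 'rating': text_or_default(item.find('i', {'class':'a-icon-alt'}), 'No rating'). The 'link' field indexes ['href'] rather than reading .text, so it would need its own None check.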
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Bill the Lizard |
