How to get <link> pulled in from Yahoo Finance RSS?
I am scraping an RSS XML feed from Yahoo Finance, and the code below works well except for one thing: the <link> field comes back empty. Does anyone have suggestions on how to get it to populate?
from urllib.request import Request, urlopen
from bs4 import BeautifulSoup
import pandas as pd

SummaryList = []
url = 'https://feeds.finance.yahoo.com/rss/2.0/headline?s=TSLA&region=US&lang=en-US'
req = Request(url=url, headers={'user-agent': 'my-app/0.0.1'})
response = urlopen(req)
soup = BeautifulSoup(response, 'html.parser')
items = soup.find_all('item')
news_items = []
# Build one dict per <item> in the feed
for item in items:
    news_item = {}
    news_item['title'] = item.title.text
    news_item['description'] = item.description.text
    news_item['link'] = item.link.text  # this is the field that comes back empty
    news_item['pubdate'] = item.pubdate.text
    news_items.append(news_item)
pd.DataFrame(news_items)
Solution 1:[1]
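I had to be creative. With 'html.parser', <link> is treated as an empty HTML tag, so it is rendered as <link/> and the URL ends up as loose text right after it, which is why item.link.text is empty. So I changed this: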
news_item['link'] = item.link.text
to this:
news_item['link'] = str(item).split('<link/>', 1)[1]
news_item['link'] = news_item['link'].split('\n', 1)[0]
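As an alternative to splitting strings, the feed can be read with an actual XML parser, which keeps the URL inside the <link> element. The following is only a minimal sketch using the standard library's xml.etree.ElementTree instead of BeautifulSoup, not the method from the answer above. SAMPLE_RSS and parse_items are hypothetical names, and the sample feed is made-up data standing in for the Yahoo Finance response.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for the body of the Yahoo Finance response.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <item>
      <title>Example headline</title>
      <description>Example summary</description>
      <link>https://example.com/news/1</link>
      <pubDate>Mon, 01 Jan 2024 00:00:00 +0000</pubDate>
    </item>
  </channel>
</rss>"""


def parse_items(rss_text):
    """Return one dict per <item>, reading each child element as XML."""
    root = ET.fromstring(rss_text)
    rows = []
    for item in root.iter('item'):
        rows.append({
            'title': item.findtext('title', default=''),
            'description': item.findtext('description', default=''),
            'link': item.findtext('link', default=''),
            # XML tag names are case-sensitive, so this is pubDate, not pubdate
            'pubdate': item.findtext('pubDate', default=''),
        })
    return rows


news_items = parse_items(SAMPLE_RSS)
print(news_items[0]['link'])
```

For the real feed, passing the bytes from urlopen(req).read() to parse_items should work the same way, and the resulting list can go straight into pd.DataFrame(news_items) as in the question.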
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | E_net4 - Krabbe mit Hüten |
