Difficulty reading online XML content with Python, xml.etree.ElementTree and urllib
I am reading XML from an online RSS feed using Python, xml.etree.ElementTree and urllib. My code seems straightforward, but it is not giving me the results I want. No matter what I do, it always returns what looks like all the data in the XML stream.
I am open to better suggestions on how to read specific strings into lists.
See my code below:
import xml.etree.ElementTree as ET
from urllib import request

title_list = []

def main():
    try:
        response = request.urlopen("https://www.abcdefghijkl.xml")
        rsp_code = response.code
        print(rsp_code)
        if rsp_code == 200:
            webdata = response.read()
            print("1")
            xml = webdata.decode('UTF-8')
            print("2")
            tree = ET.parse(xml)
            print("3")
            items = tree.findall('channel')
            print("4")
            for item in items:
                title = item.find('title').text
                title_list.append(title)
            print(f"title_list 0 is, {title_list}")
            print("5")
    except Exception as e:
        print(f'An error occurred {str(e)}')

main()
Solution 1:[1]
Thanks, everyone. I figured it out after watching an awesome Udemy video. I ended up using the bs4 (Beautiful Soup) library together with requests. Here's the code:
import bs4
import requests

title_list = []

def main():
    try:
        result = requests.get("https://abcdefghijk.xml")
        res_text = result.text
        soup = bs4.BeautifulSoup(res_text, features="xml")
        title_tag_list = soup.select('title')
        for titles in title_tag_list:
            title = titles.text
            title_list.append(title)
        print(f"title_list 0 is, {title_list}")
        print("5")
    except Exception as e:
        print(f'An error occurred {str(e)}')

main()
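For comparison, the ElementTree approach from the question can probably be made to work with the standard library alone. ET.parse() expects a filename or a file object, so handing it the decoded XML text makes it try to open a file with that name, which would explain why the error message appears to contain the whole feed. ET.fromstring() parses XML that is already in memory. Also, in an RSS 2.0 feed the root element is rss, so the entries sit under channel/item rather than directly under channel. The sketch below assumes that layout. SAMPLE_RSS and extract_item_titles are hypothetical names used only for illustration, and the sample bytes stand in for what response.read() would return.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed standing in for the bytes from response.read().
SAMPLE_RSS = b"""<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>First post</title></item>
    <item><title>Second post</title></item>
  </channel>
</rss>"""


def extract_item_titles(xml_bytes):
    """Return the <title> text of every <item> in an RSS 2.0 document."""
    # fromstring() parses XML held in memory; ET.parse() would treat
    # its argument as a filename or file object instead.
    root = ET.fromstring(xml_bytes)
    # root is the <rss> element, so the entries are found at channel/item.
    return [
        item.findtext("title", default="")
        for item in root.findall("channel/item")
    ]


if __name__ == "__main__":
    print(extract_item_titles(SAMPLE_RSS))
```

With a live feed, the bytes returned by response.read() could be passed straight to extract_item_titles without decoding first, since the parser reads the encoding from the XML declaration. Feeds that use XML namespaces or the Atom format would need different paths.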
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | MIike Eps |
