How to handle unicode directly with pandas.read_xml?

I have an .xml file from an online source and want to read it directly into Python. I use the pandas command:

pd.read_xml(url)

However, I get the error:

File "<string>", line 3300
lxml.etree.XMLSyntaxError: PCDATA invalid Char value 26, line 3300, column 15

Inspecting the dataset, I see that the line contains a special character (PyCharm shows a [SUB] between the whitespace characters after XETRA):

<column>XETRA  Regulierter Markt</column>

Can I handle these special characters directly in pandas, or do I need to download the file first and clean it up? How could I strip such characters from the XML?
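For reference, here is a minimal sketch of the "download first, then clean" route, not a definitive solution. Char value 26 is the ASCII control character SUB (0x1A), which XML 1.0 does not allow in character data, so the idea is to remove such control bytes before the parser sees them. The helper names `strip_invalid_xml_bytes` and `read_xml_cleaned` are made up for illustration, and the sketch assumes the file uses an ASCII-compatible encoding such as UTF-8 or Latin-1.

```python
import io
import re
from urllib.request import urlopen

# XML 1.0 forbids the C0 control characters except tab (0x09), LF (0x0A)
# and CR (0x0D). Char value 26 (0x1A, SUB) from the error is in this range.
_INVALID_XML_BYTES = re.compile(rb"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def strip_invalid_xml_bytes(raw: bytes) -> bytes:
    """Remove control bytes that are not allowed in XML 1.0.

    Assumption: the data is in an ASCII-compatible encoding (UTF-8,
    Latin-1, ...), where these byte values can only be control characters.
    """
    return _INVALID_XML_BYTES.sub(b"", raw)


def read_xml_cleaned(url: str, **kwargs):
    """Download the XML, strip invalid control bytes, then parse with pandas."""
    # Imported lazily so the cleaning helper itself needs only the stdlib.
    import pandas as pd

    with urlopen(url) as response:
        raw = response.read()
    # pandas.read_xml accepts a file-like object, so no temporary file is needed.
    return pd.read_xml(io.BytesIO(strip_invalid_xml_bytes(raw)), **kwargs)
```

With this in place, `df = read_xml_cleaned(url)` would stand in for `pd.read_xml(url)`, and any extra keyword arguments are passed through to `pd.read_xml` unchanged. Removing the character leaves the two surrounding spaces in the value; if a single space is preferred, the replacement could be adjusted accordingly.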



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
