BeautifulSoup: Can't convert NavigableString to string
I'm starting to learn Python and I've decided to code a simple scraper. One problem I'm encountering is that I cannot convert a NavigableString to a regular string.
I'm using BeautifulSoup4 and Python 3.5.1. Should I just bite the bullet and go back to an earlier version of Python and BeautifulSoup? Or is there a way I can write my own function to cast a NavigableString to a regular unicode string?
    for tag in soup.find_all("span"):
        for child in tag.children:
            if "name" in tag.string: #triggers error, can't compare string to NavigableString/bytes
                return child
    #things i've tried:
    #if "name" in str(tag.string)
    #if "name" in unicode(tag.string) #not in 3.5?
    #if "name" in strring(tag.string, "utf-8")
    #tried regex, didn't work. Again, doesn't like NavigableString type.
    #... bunch of other stuff too!
Solution 1:[1]
For Python 3, the answer is simply str(tag.string).
The other approaches fail in Python 3:
unicode() is not a built-in in Python 3.
tag.string.encode('utf-8') will convert the string to a byte string, which you don't want.
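Here is a minimal sketch of how str(tag.string) could fit into the loop from the question. The HTML snippet and the helper name find_span_text are made up for illustration and are not from the original post. In Python 3, NavigableString is a subclass of str, so the "in" test works on it directly. One thing that may be worth guarding against: tag.string is None when a tag has more than one child, and "name" in None raises a TypeError.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, used only for illustration.
html = (
    "<div>"
    "<span>first name</span>"
    "<span><b>x</b><i>y</i></span>"
    "<span>age</span>"
    "</div>"
)
soup = BeautifulSoup(html, "html.parser")


def find_span_text(soup, needle):
    """Return the text of the first <span> whose string contains needle."""
    for tag in soup.find_all("span"):
        # tag.string is None when the tag has more than one child,
        # so check for that before using the "in" operator.
        if tag.string is not None and needle in tag.string:
            # str() converts the NavigableString to a regular string.
            return str(tag.string)
    return None


result = find_span_text(soup, "name")
print(type(result), result)
```

Returning str(tag.string) rather than the NavigableString itself also drops the reference back to the parse tree, which can matter if you keep many results around.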
Solution 2:[2]
In Python 2 (where unicode() is a built-in), you can do this:
unicode(tag.string)
Solution 3:[3]
I came across this question and solved it best using the answer by Mark Ramson to "How to remove this \xa0 from a string in python?":
import unidecode
word = unidecode.unidecode(tag.string)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Stephen Rauch |
| Solution 3 | marco |
