How to solve UnicodeDecodeError in Python 3.6?
I switched from Python 2.7 to Python 3.6.
I have scripts that deal with some non-English content.
I usually run scripts via Cron and also in Terminal.
I had a UnicodeDecodeError in my Python 2.7 scripts and solved it with this:
# encoding=utf8
import sys
reload(sys)
sys.setdefaultencoding('utf8')
Now in Python 3.6, it doesn't work. I have print statements like print("Here %s" % (myvar)) and they throw an error. I can work around it by replacing myvar with myvar.encode("utf-8"), but I don't want to do that in every print statement.
I set PYTHONIOENCODING=utf-8 in my terminal and I still have the issue.
Is there a cleaner way to solve the UnicodeDecodeError issue in Python 3.6?
Is there any way to tell Python 3 to print everything in UTF-8, just like I did in Python 2?
Solution 1:[1]
I had this issue when using Python inside a Docker container based on Ubuntu 18.04. It appeared to be a locale issue, which was solved by adding the following to the Dockerfile:
ENV LANG C.UTF-8
Solution 2:[2]
To everyone using pickle to load a file previously saved in Python 2 and getting a UnicodeDecodeError: try setting the encoding parameter of pickle.load:
import pickle

with open("./data.pkl", "rb") as data_file:
    samples = pickle.load(data_file, encoding='latin1')
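To see why the encoding parameter matters, here is a minimal sketch. The byte string py2_pickle is a hand-written stand-in (an assumption for illustration, not a file from the question) for what Python 2 would write when pickling the byte string 'caf\xe9' with protocol 2.

```python
import pickle

# Hypothetical sample: a protocol-2 pickle of the Python 2 str 'caf\xe9'
# (the Latin-1 bytes of "café"), written out by hand for this demo.
py2_pickle = b"\x80\x02U\x04caf\xe9q\x00."

# With the default encoding (ASCII), the non-ASCII byte cannot be decoded.
try:
    pickle.loads(py2_pickle)
    default_failed = False
except UnicodeDecodeError:
    default_failed = True

# latin1 maps every byte to a character, so decoding always succeeds.
as_text = pickle.loads(py2_pickle, encoding="latin1")

# encoding="bytes" skips decoding and keeps the raw bytes instead.
as_bytes = pickle.loads(py2_pickle, encoding="bytes")
```

Whether latin1 gives you the right characters depends on how the original strings were encoded. If they were UTF-8, encoding="bytes" followed by an explicit decode may be the safer route.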
Solution 3:[3]
For a Python-only solution you will have to recreate your sys.stdout object:
import sys, codecs
sys.stdout = codecs.getwriter('utf-8')(sys.stdout.detach())
After this, a normal print("hello world") should be encoded to UTF-8 automatically.
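As a rough illustration of what that wrapper does, the sketch below pushes the same text through a UTF-8 writer and an ASCII writer. The io.BytesIO objects are stand-ins (an assumption for this demo) for the binary buffer that sys.stdout.detach() returns, so nothing here touches your real stdout.

```python
import codecs
import io

text = "Grüße, 世界"

# Stand-in for the binary buffer behind sys.stdout.
raw = io.BytesIO()
utf8_writer = codecs.getwriter("utf-8")(raw)
utf8_writer.write(text)
utf8_bytes = raw.getvalue()

# The same text through an ASCII writer fails. This mirrors what print()
# runs into when the stdout encoding is ASCII.
try:
    codecs.getwriter("ascii")(io.BytesIO()).write(text)
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
```

On Python 3.7 and later, sys.stdout.reconfigure(encoding="utf-8") should achieve the same thing without replacing the object, but that method is not available in 3.6.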
But you should try to find out why your terminal is set to such a strange encoding (which Python just tries to adapt to). Maybe your operating system is misconfigured somehow.
EDIT: In my tests, unsetting the LANG environment variable produced this strange stdout encoding (ANSI_X3.4-1968 is another name for plain ASCII):
$ LANG= python3
>>> import sys
>>> sys.stdout.encoding
'ANSI_X3.4-1968'
So I guess you might want to set your LANG to something like
en_US.UTF-8. Your terminal program doesn't seem to do this.
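To check what a given environment (your terminal versus Cron, for instance) actually hands to Python, something like the following can help. The helper name encoding_report is made up for this sketch; the calls it uses are from the standard library.

```python
import locale
import os
import sys


def encoding_report():
    """Collect the settings that influence how print() encodes text."""
    return {
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
        "preferred_encoding": locale.getpreferredencoding(False),
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
        "PYTHONIOENCODING": os.environ.get("PYTHONIOENCODING"),
    }


report = encoding_report()
for name, value in report.items():
    print("%s = %r" % (name, value))
```

Running this once from the terminal and once from the Cron job should show whether the two environments differ. Cron typically starts with a very small environment, so LANG may well be unset there.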
Solution 4:[4]
Python 3 (including 3.6) already supports Unicode natively. See the documentation: https://docs.python.org/3/howto/unicode.html
So you don't need to force Unicode support as you did in Python 2.7. Try running your code normally. If you get an error while reading a Unicode text file, pass the encoding='utf-8' parameter to open() when reading the file.
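A small sketch of the file-reading case, using a temporary file so it runs anywhere. The file name and sample text are made up for the example.

```python
import os
import tempfile

sample = "naïve café, 日本語"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.txt")  # hypothetical file name

    # Write the text as UTF-8 so the file contains non-ASCII bytes.
    with open(path, "w", encoding="utf-8") as out_file:
        out_file.write(sample)

    # Reading with an explicit encoding does not depend on the locale.
    with open(path, "r", encoding="utf-8") as in_file:
        text = in_file.read()

    # Reading the same file as ASCII is what raises UnicodeDecodeError.
    try:
        with open(path, "r", encoding="ascii") as in_file:
            in_file.read()
        ascii_failed = False
    except UnicodeDecodeError:
        ascii_failed = True
```

Without the encoding argument, open() falls back to the locale's preferred encoding, which is why the same script can work in a terminal and fail under Cron.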
Solution 5:[5]
For Docker with Python 3.6, running LANG=C.UTF-8 python or LANG=C.UTF-8 jupyter xxx works for me. Thanks to @Daniel and @zhy.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Daniel |
| Solution 2 | Mark Storm |
| Solution 3 | |
| Solution 4 | ananto30 |
| Solution 5 | zhibo |
