How to fix this error? 'utf-8' codec can't decode byte 0xef in position 32887: invalid continuation byte

Hello. I am trying to open this file, which is in .txt format, but it gives me an error.



Solution 1:[1]

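When your files don't all share the same encoding, you have to be specific about the correct one. Pass it to open(), for example (if the file is not actually UTF-8, as the error in the question suggests, substitute the encoding it was really written in):

with open('file.txt', encoding='utf-8') as f:
    text = f.read()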
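
If you don't know the right encoding up front, one option is to try a few candidates in order and keep the first one that decodes without error. The following is only a sketch: the helper name read_text_with_fallback and the candidate list are assumptions made for illustration, not part of the original answer, and the demo file is created just so the example runs on its own.

```python
import os
import tempfile


def read_text_with_fallback(path, encodings=("utf-8", "cp1252", "latin-1")):
    """Return (text, encoding) for the first candidate that decodes cleanly.

    Hypothetical helper; the candidate list is an assumption. Note that
    latin-1 accepts every byte value, so as a last resort it never raises,
    but it may still produce the wrong characters.
    """
    for enc in encodings:
        try:
            with open(path, encoding=enc) as f:
                return f.read(), enc
        except UnicodeDecodeError:
            continue  # this candidate does not fit, try the next one
    raise ValueError(f"none of {encodings} could decode {path!r}")


# Demo: write a small cp1252 file that is not valid UTF-8, then read it back.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write("café ï".encode("cp1252"))  # 'ï' is the single byte 0xef in cp1252
    demo_path = tmp.name

text, used = read_text_with_fallback(demo_path)
os.remove(demo_path)
print(used, text)
```

A file that decodes without error is not necessarily decoded correctly, so it is worth checking the resulting text by eye.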

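You can also detect the file's encoding like this:

from chardet import detect

# Read the raw bytes and let chardet guess the encoding
with open('file.txt', 'rb') as f:
    rawdata = f.read()
    enc = detect(rawdata)['encoding']

# Reopen the same file as text using the detected encoding
with open('file.txt', encoding=enc) as f:
    text = f.read()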
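
One caveat with detection: chardet's guess is statistical, and the 'encoding' value can be None when it has nothing to go on. Passing None to open() silently falls back to the platform default encoding. A minimal sketch of guarding against that follows; the helper name guess_encoding and its default of 'utf-8' are assumptions for illustration, and the import is wrapped so the snippet still runs where chardet is not installed.

```python
try:
    from chardet import detect  # third-party package: pip install chardet
except ImportError:  # keep the sketch runnable without chardet
    detect = None


def guess_encoding(raw, default="utf-8"):
    """Return chardet's guess for raw bytes, or default when there is none.

    Hypothetical helper; the default value is an assumption.
    """
    if detect is None:
        return default
    # detect() returns a dict; its 'encoding' entry is None when there is no guess
    return detect(raw)["encoding"] or default


# Empty input gives chardet nothing to analyse, so the default is used.
print(guess_encoding(b""))
```

You could then pass the returned name to open(..., encoding=...) as in the snippet above.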

Result:

>>> from chardet import detect
>>>
>>> with open('test.c', 'rb') as f:
...     rawdata = f.read()
...     enc = detect(rawdata)['encoding']
...
>>> print(enc)
ascii

Python 3.7.0

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1