Convert Python Bytes to String Without Encoding
I am using Python 3.6 and I have an image as bytes:
img = b'\xff\xd8\xff\xe0\x00\x10JFIF\x00'
I need to convert the bytes into a string without encoding so it looks like:
raw_img = '\xff\xd8\xff\xe0\x00\x10JFIF\x00'
The goal is to incorporate this into an html image tag:
"<img src='data:image/png;base64," + base64.b64encode(raw_img) + "' />"
Solution 1:[1]
Why not just call str and remove the b after?
In:
str(img)[2:-1]
Out:
'\\xff\\xd8\\xff\\xe0\\x00\\x10JFIF\\x00'
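A quick check of what the slice actually holds may help here. The sketch below compares it with a latin-1 decode (the codec choice is an assumption borrowed from Solution 5, not part of this answer): the sliced string contains the escape sequences as literal text, so it is longer than the original data.

```python
img = b'\xff\xd8\xff\xe0\x00\x10JFIF\x00'

# Text of the bytes literal with the leading b' and trailing ' removed.
sliced = str(img)[2:-1]

# One character per byte (assumption: latin-1, as in Solution 5).
decoded = img.decode('latin-1')

print(len(img), len(sliced), len(decoded))
print(sliced.startswith('\\xff'))  # a backslash, then the letters x, f, f
```

So the slice looks right when printed, but it is not a character-for-byte copy of the data.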
Solution 2:[2]
img.decode("utf-8")
You can decode the variable with the above. Then convert it to base64.
"<img src='data:image/png;base64,{}'/>".format( base64.b64encode(img.decode("utf-8")) )
UPDATED:
What you really want is this:
raw_img = repr(img)
"<img src='data:image/png;base64,{}'/>".format( base64.b64encode(raw_img) )
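Before relying on either variant above, it may be worth checking how they behave on this particular input. The sketch below only records which calls raise; the flag names are made up for illustration.

```python
import base64

img = b'\xff\xd8\xff\xe0\x00\x10JFIF\x00'

# 0xff is not a valid UTF-8 start byte, so decoding this data should fail.
try:
    img.decode("utf-8")
    utf8_decodes = True
except UnicodeDecodeError:
    utf8_decodes = False

# b64encode expects a bytes-like object; repr(img) is a str.
try:
    base64.b64encode(repr(img))
    accepts_str = True
except TypeError:
    accepts_str = False

print(utf8_decodes, accepts_str)
```

Both flags come out False, which suggests passing img itself to base64.b64encode instead of a decoded or repr'd string.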
Solution 3:[3]
Since you just need to convert the image to a string, why not just use the str() function?
>>> img = b'\xff\xd8\xff\xe0\x00\x10JFIF\x00'
>>> type(img)
<class 'bytes'>
>>>
>>> raw_img = str(img)
>>> type(str(img))
<class 'str'>
>>>
img is of type bytes, but str() converts it to type str.
An encoding can also be specified (see https://docs.python.org/3/library/stdtypes.html#str), which would be a more natural way to do it:
str(img, encoding='ansi')
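The two forms of str() behave quite differently, which the following sketch tries to show. It uses latin-1 rather than 'ansi' as an assumption: as far as I know, 'ansi' is only available as a codec name on Windows, while latin-1 exists everywhere.

```python
img = b'\xff\xd8\xff\xe0\x00\x10JFIF\x00'

# Without an encoding, str() returns the printable representation,
# including the b prefix and the quotes.
as_repr = str(img)

# With an encoding, str() decodes the bytes into characters.
as_text = str(img, encoding='latin-1')

print(as_repr[:2])                       # the b prefix and opening quote
print(len(as_repr), len(as_text))
print(as_text.encode('latin-1') == img)  # decoding is reversible
```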
Solution 4:[4]
I'm pretty sure img is the byte string that you want to pass to base64.b64encode:
>>> import base64
>>> img = b'\xff\xd8\xff\xe0\x00\x10JFIF\x00'
>>> base64.b64encode(img)
b'/9j/4AAQSkZJRgA='
If you want to incorporate that into an HTML string, use
html = b'<img src="data:image/png;base64,' + base64.b64encode(img) + b'" />'
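If the HTML is needed as a str rather than bytes, one option is to decode the Base64 output as ASCII first. This is a minimal sketch; the helper name img_tag and its mime parameter are hypothetical and not from the answer.

```python
import base64

def img_tag(data: bytes, mime: str = "image/png") -> str:
    """Return an <img> tag embedding data as a Base64 data URI."""
    # Base64 output only contains ASCII characters, so this decode is safe.
    b64 = base64.b64encode(data).decode("ascii")
    return '<img src="data:{};base64,{}" />'.format(mime, b64)

img = b'\xff\xd8\xff\xe0\x00\x10JFIF\x00'
html = img_tag(img)
print(html)
```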
Solution 5:[5]
I didn't solve this, but here is some research on it (3 Feb 2022). The encoding you are after is Latin-1 (latin-1). The result is hard to print because Python displays the characters in a different form, but for your case the two strings should be the same. Also, a data:image/png;base64 URI needs Base64 text, not the raw string.
My test code:
import codecs
img = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00"
desired = "\xff\xd8\xff\xe0\x00\x10JFIF\x00"
str_decode = img.decode("latin-1")
str_decode_2 = str(img, "latin-1")
codecs_decode = codecs.decode(img, "latin-1")
print(desired.encode("latin-1") == img)
print(str_decode == desired)
print(str_decode == str_decode_2)
print(str_decode == codecs_decode)
print("desired:", repr(desired)) ##devprint
This prints True four times, followed by desired: 'ÿØÿà\x00\x10JFIF\x00', with Python 3.10.
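The reason latin-1 works for arbitrary binary data is that it maps each byte value 0 to 255 to the code point with the same number. A small sketch to confirm this over every possible byte (variable names are illustrative):

```python
# Every possible byte value, in order.
all_bytes = bytes(range(256))
text = all_bytes.decode("latin-1")

# Each character's code point equals the byte it came from.
code_points_match = all(ord(ch) == b for ch, b in zip(text, all_bytes))

# Encoding back gives the original bytes, so nothing is lost.
round_trips = text.encode("latin-1") == all_bytes

print(code_points_match, round_trips)
```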
Solution 6:[6]
I've solved it (2022 - bit late to the party...)
If you try img_raw.decode(), you get this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
But if you leave img_raw as a bytes object, pass it to b64encode, and then decode the result, there is no UnicodeDecodeError, and you can use the result as the data string in your image tag.
base64.b64encode(img_raw).decode()
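To see that nothing is lost along the way, the Base64 string can be decoded back to the original bytes. A small sketch, using the sample bytes from the question as a stand-in for the real image:

```python
import base64

# Sample data from the question; a real image would be read from a file.
img_raw = b'\xff\xd8\xff\xe0\x00\x10JFIF\x00'

# bytes -> Base64 bytes -> str (decode() defaults to UTF-8, and
# Base64 output is plain ASCII, so this cannot fail).
data_string = base64.b64encode(img_raw).decode()

# Decoding the string again recovers the original bytes.
restored = base64.b64decode(data_string)

print(data_string)
print(restored == img_raw)
```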
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Primusa |
| Solution 2 | |
| Solution 3 | |
| Solution 4 | |
| Solution 5 | |
| Solution 6 | lizj |
