OSError: cannot identify image file <_io.BytesIO object at 0x02F41960> while trying to reuse downloaded header
My project uses Python 3.6.2. I'm trying to decide whether an image is worth downloading at all (that is, whether it has a certain aspect ratio) by reading only its header (the first ~100 bytes of the online file). So far I'm just testing with imghdr and Pillow.
Image.open fails at the end with:
  File "C:\Program Files (x86)\Python36-32\lib\site-packages\PIL\Image.py", line 2349, in open
    % (filename if filename else fp))
OSError: cannot identify image file <_io.BytesIO object at 0x02F41960>
I found the Pillow 2.8.0 release notes, which seemed to suggest I'd be able to use Image.open(response.raw). I assumed I could reuse the already-downloaded header, provided I reset it with seek(0).
Other answers about this error seem to involve saving the image buffer to an actual file, which I am trying to avoid: I want to reuse the bytes already downloaded from response.raw for all my tests and checks, without making multiple download requests to any server.
Where am I going wrong please?
Here is my sample code:
import requests
from PIL import Image
import imghdr
import io

if __name__ == '__main__':
    url = "https://ichef-1.bbci.co.uk/news/660/cpsprodpb/37B5/production/_89716241_thinkstockphotos-523060154.jpg"
    try:
        response = requests.get(url, stream=True)
        if response.status_code == 200:
            response.raw.decode_content = True
            # Grab first 100 bytes as potential image header
            header = response.raw.read(100)
            ext = imghdr.what(None, h=header)
            print("Found: " + ext)
            if ext != None: # Proceed to other tests if we received an image at all
                header = io.BytesIO(header)
                header.seek(0)
                im = Image.open(header)
                im.verify()
                # other image-related tasks here
        else:
            print("Received error " + str(response.status_code))
    except requests.ConnectionError as e:
        print(e)
Solution 1:[1]
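You have to get the rest of the image data before calling Image.open(). The first 100 bytes are enough for imghdr to recognise the file signature, but Pillow has to parse further into the file before it can identify the image, and 100 bytes is generally not enough for that, so it raises the OSError.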
This is what I mean:
import requests
from PIL import Image
import imghdr
import io

if __name__ == '__main__':
    url = "https://ichef-1.bbci.co.uk/news/660/cpsprodpb/37B5/production/_89716241_thinkstockphotos-523060154.jpg"
    try:
        response = requests.get(url, stream=True)
        if response.status_code == 200:
            response.raw.decode_content = True
            # Grab first 100 bytes as potential image header
            header = response.raw.read(100)
            ext = imghdr.what(None, h=header)
            # str() avoids a TypeError when no image type was recognised (ext is None)
            print("Found: " + str(ext))
            if ext is not None:  # Proceed to other tests if we received an image at all
                data = header + response.raw.read()  # GET THE REST OF THE FILE
                data = io.BytesIO(data)
                im = Image.open(data)
                im.verify()
                # other image-related tasks here
        else:
            print("Received error " + str(response.status_code))
    except requests.ConnectionError as e:
        print(e)
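If the goal is only to check the aspect ratio before committing to the full download, one possible alternative (not part of the code above) is Pillow's incremental ImageFile.Parser: feed it small chunks and stop as soon as it can identify the image. The following is a minimal sketch under that assumption. The helper names probe_size and has_aspect_ratio, the chunk size, the byte limit and the tolerance are all hypothetical choices made for illustration.

```python
import io

from PIL import Image, ImageFile


def probe_size(stream, chunk_size=1024, max_bytes=64 * 1024):
    """Read from `stream` until Pillow can identify the image.

    `stream` is any object with a read(n) method (for example response.raw).
    Returns (size, consumed): size is (width, height), or None if the image
    could not be identified within max_bytes; consumed holds the bytes read
    so far, so they can be reused instead of being downloaded again.
    """
    parser = ImageFile.Parser()
    consumed = bytearray()
    while len(consumed) < max_bytes:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # stream exhausted
        consumed.extend(chunk)
        parser.feed(chunk)
        # parser.image is set once enough header data has arrived
        if parser.image is not None:
            return parser.image.size, bytes(consumed)
    return None, bytes(consumed)


def has_aspect_ratio(size, target, tolerance=0.05):
    """True if width / height is within a relative tolerance of target."""
    width, height = size
    return height > 0 and abs(width / height - target) <= tolerance * target
```

With requests this could be used as size, head = probe_size(response.raw). If has_aspect_ratio(size, 4 / 3) passes, the rest of the file can be assembled with head + response.raw.read() and opened as shown above. Also note that after im.verify() the image has to be reopened with Image.open() before its pixel data can be loaded.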
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | martineau |
