Cannot get a response from urllib.request.urlopen with a URL ending with a dot
I have a script like this, where the username in the URL ends with a dot ("."):
import urllib.request
url = "https://likee.video/@evadecarle."
response = urllib.request.urlopen(url)
print(response)
The ending dot "." in the url seems to cause a problem.
If I change the url to url = "https://likee.video/@11Happyness07.12" it works fine.
How do I make it work with the trailing dot "."?
Solution 1:[1]
If we try to fetch https://likee.video/@evadecarle. using urllib.request, we see:
>>> import urllib.request
>>> response = urllib.request.urlopen('https://likee.video/@evadecarle.')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.10/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib64/python3.10/urllib/request.py", line 525, in open
    response = meth(req, response)
  File "/usr/lib64/python3.10/urllib/request.py", line 634, in http_response
    response = self.parent.error(
  File "/usr/lib64/python3.10/urllib/request.py", line 563, in error
    return self._call_chain(*args)
  File "/usr/lib64/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/usr/lib64/python3.10/urllib/request.py", line 643, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 302: Moved Temporarily
>>>
It's failing because the remote website is returning a 302 status code
(an HTTP redirect). Normally, redirects are followed by an
HTTPRedirectHandler (the default opener behind urlopen already installs
one); building an opener explicitly looks something like:
>>> opener = urllib.request.build_opener(urllib.request.HTTPRedirectHandler(), urllib.request.HTTPHandler(debuglevel=0))
>>> resp = opener.open('https://google.com')
>>> resp.url
'https://www.google.com/'
Unfortunately, the URL https://likee.video/@evadecarle. is an odd
one: it returns a 302 status code, but doesn't include a Location:
header identifying the redirect target.
Because of this, urllib's redirect handler has no target to follow, so the
response falls through to the default error handler, which raises HTTPError.
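If you want to stay with urllib, one option is to catch the exception, since an HTTPError object also behaves like a response and still carries the status code and body. The following is only a sketch under that assumption; the helper name fetch_even_on_error is made up for illustration and is not part of urllib:

```python
import urllib.error
import urllib.request


def fetch_even_on_error(url, timeout=10):
    """Return (status_code, body_bytes) for url.

    Hypothetical helper: when urllib raises HTTPError (for example on a
    302 without a Location header), the body is read from the exception
    instead of being discarded.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # HTTPError wraps the underlying response, so it exposes the
        # status code and a readable body.
        try:
            return err.code, err.read()
        finally:
            err.close()
```

Calling fetch_even_on_error("https://likee.video/@evadecarle.") should then give back the 302 status together with whatever HTML the server sent, instead of a traceback.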
Someone else may correct me on this, but it looks like the requests
library handles this without a problem:
>>> import requests
>>> resp = requests.get('https://likee.video/@evadecarle.')
>>> resp
<Response [302]>
>>> resp.text[:80]
'<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"><meta name="robots" c'
So using the requests module may be the simplest solution.
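If adding a third-party dependency is not an option, a standard-library alternative might be a redirect handler that hands the response back when a 302 has no Location header. This is a rough sketch, not a tested fix for this particular site; LenientRedirectHandler and build_lenient_opener are names invented here:

```python
import urllib.request


class LenientRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Hypothetical handler: tolerate a 302 that names no redirect target."""

    def http_error_302(self, req, fp, code, msg, headers):
        # Header lookups are case-insensitive. With no target to follow,
        # return the original response instead of letting urllib raise.
        if "location" not in headers and "uri" not in headers:
            return fp
        # Ordinary redirects keep the stock behaviour.
        return super().http_error_302(req, fp, code, msg, headers)


def build_lenient_opener():
    # build_opener swaps the default HTTPRedirectHandler for the subclass.
    return urllib.request.build_opener(LenientRedirectHandler)
```

With this, build_lenient_opener().open(url) should return a response object whose status is 302 and whose read() yields the page body. Only the 302 case is overridden here; other redirect codes keep the default handling.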
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | larsks |
