Check if a user input has https and include it if it doesn't
I've been developing a web scraper for a certain website, and my goal is to let users simply input the link to a page of this website to get some information back.
Sometimes, users will copy/paste the link without "https://" and my app returns an error.
So, I thought of this to avoid the error:
url = input("Your link")
if 'https://' in url:
    url
else:
    url = 'https://'+url
It works, but I wonder if it is the best solution to check just the https part.
Also, how could I check that the input is a valid link? For example, check that the link starts with https://certaindomain
Solution 1:[1]
You can use urllib.parse.urlparse(): the .scheme attribute of the result tells you whether the URL uses HTTPS, and the .netloc attribute lets you check whether the website name is correct.
It's not immediately clear what output you're expecting for URLs that don't use HTTPS, so this code snippet is mostly to demonstrate how to get the scheme, and I'll let you customize the error handling to your own situation:
from urllib.parse import urlparse
url = input("Your link")
parse_result = urlparse(url)
print(parse_result.scheme)
print(parse_result.netloc)
If you pass in:
https://github.com/
The program will output:
https
github.com
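To connect this back to the question, here is a minimal sketch of how the scheme and domain checks could be combined. One thing to watch for: if the input has no scheme at all (for example github.com/python), urlparse() puts the whole string in .path and leaves .scheme and .netloc empty, so the sketch prepends https:// before parsing. The function name normalize_url and the EXPECTED_DOMAIN value are hypothetical placeholders, and rejecting non-HTTPS links with a ValueError is just one possible policy, not something the question specifies.

```python
from urllib.parse import urlparse

# Hypothetical placeholder: replace with the domain your scraper targets.
EXPECTED_DOMAIN = "github.com"


def normalize_url(url, expected_domain=EXPECTED_DOMAIN):
    """Return url with an https:// scheme, or raise ValueError if it is not acceptable."""
    url = url.strip()

    # Without a scheme, urlparse() treats the whole string as a path,
    # so add https:// first when no scheme separator is present.
    if "://" not in url:
        url = "https://" + url

    parsed = urlparse(url)

    # Assumption: anything other than https is rejected rather than rewritten.
    if parsed.scheme != "https":
        raise ValueError(f"Expected an https link, got scheme {parsed.scheme!r}")

    # .hostname is the lowercased host without any port or credentials.
    host = parsed.hostname or ""
    if host != expected_domain and not host.endswith("." + expected_domain):
        raise ValueError(f"Expected a link to {expected_domain}, got {host!r}")

    return url
```

Calling normalize_url(input("Your link")) would then either return a link that starts with https:// and points at the expected domain (or one of its subdomains), or raise a ValueError that the app can catch to show a friendlier message. Comparing the parsed host, instead of testing whether 'https://' appears anywhere in the string, avoids accepting inputs that merely contain that text somewhere in the path.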
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | BrokenBenchmark |
