Check if a user input has https and include it if it doesn't

I've been developing a web scraper for a certain website, and my goal is to let users paste the link to a page on that website and get some information back.

Sometimes, users will copy/paste the link without "https://" and my app returns an error.

So, I thought of this to avoid the error:

url = input("Your link")

if 'https://' in url:
  url
else:
  url = 'https://'+url

It works, but I wonder whether it is the best way to check just the https part.

Also, how could I check that the input is a valid link? For example, check that the link starts with https://certaindomain



Solution 1:[1]

You can use urllib.parse.urlparse(): its .scheme attribute tells you whether the URL uses HTTPS, and its .netloc attribute lets you check whether the domain is the one you expect.

It's not immediately clear what output you're expecting for URLs that don't use HTTPS, so this code snippet is mostly to demonstrate how to get the scheme, and I'll let you customize the error handling to your own situation:

from urllib.parse import urlparse

url = input("Your link")

parse_result = urlparse(url)

print(parse_result.scheme)
print(parse_result.netloc)

If you pass in:

https://github.com/

The program will output:

https
github.com
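
One caveat: if the user pastes a link without any scheme (for example github.com/), urlparse() returns an empty .scheme and an empty .netloc, and the host ends up in .path. So it is probably easier to add the scheme first and parse afterwards. Below is a minimal sketch of that idea. The names normalize_url and ALLOWED_DOMAIN are hypothetical, example.com is only a placeholder for your site, and rejecting http:// links (instead of upgrading them) is an assumption you may want to change:

```python
from urllib.parse import urlparse

# Hypothetical placeholder; replace with the domain your scraper supports.
ALLOWED_DOMAIN = "example.com"


def normalize_url(raw, allowed_domain=ALLOWED_DOMAIN):
    """Return raw as an https:// URL on allowed_domain, or raise ValueError."""
    url = raw.strip()

    # Without a scheme, urlparse() leaves .netloc empty and puts the host
    # in .path, so prepend the scheme before parsing.
    if "://" not in url:
        url = "https://" + url

    parts = urlparse(url)

    # Assumption: anything that is not https (including http) is rejected.
    if parts.scheme != "https":
        raise ValueError(f"expected an https URL, got scheme {parts.scheme!r}")

    # .hostname is the lowercased host without port or credentials.
    host = parts.hostname or ""
    if host != allowed_domain and not host.endswith("." + allowed_domain):
        raise ValueError(f"unexpected domain: {host!r}")

    return url
```

You could then call normalize_url(input("Your link: ")) inside a try/except ValueError block and show the user a message when the link is rejected. Comparing the parsed host, rather than checking whether the domain appears somewhere in the string, avoids accepting links that only mention the domain in their path or query.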

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 BrokenBenchmark