Need to write a regex for a top-level web address that includes letters, numbers, and underscores, as well as periods, dashes, and a plus sign

The check_web_address function checks if the text passed qualifies as a top-level web address, meaning that it contains alphanumeric characters (which includes letters, numbers, and underscores), as well as periods, dashes, and a plus sign, followed by a period and a character-only top-level domain such as ".com", ".info", ".edu", etc. Fill in the regular expression to do that, using escape characters, wildcards, repetition qualifiers, beginning and end-of-line characters, and character classes.

import re

def check_web_address(text):
  pattern = ___
  result = re.search(pattern, text)
  return result != None

print(check_web_address("gmail.com")) # True
print(check_web_address("www@google")) # False
print(check_web_address("www.Coursera.org")) # True
print(check_web_address("web-address.com/homepage")) # False
print(check_web_address("My_Favorite-Blog.US")) # True

I tried this pattern, but every sample input returns True:

pattern = '[^/@][A-Za-z._-]*$'

What is the exact pattern that covers all of the above scenarios?
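
A quick way to see why every sample returns True: the attempted pattern has a `$` anchor but no `^`, so `re.search` is free to match only the tail of the text. The sketch below prints which part actually matched; the `samples` list and the `matched` dict are names made up for this illustration.

```python
import re

# The attempted pattern: anchored at the end ($) but not at the start.
pattern = r'[^/@][A-Za-z._-]*$'

samples = ["gmail.com", "www@google", "web-address.com/homepage"]

# re.search may begin matching anywhere, so it simply starts after '@' or '/'.
matched = {s: re.search(pattern, s).group() for s in samples}
for s, m in matched.items():
    print(f"{s!r} matched on {m!r}")
```

Adding `^` at the front forces the whole text to fit the pattern, which is what the solutions do in one form or another.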



Solution 1:[1]

I finally found a pattern that covers all of the above scenarios:

import re
def check_web_address(text):
  pattern = r'^[A-Za-z._-][^/@]*$'
  result = re.search(pattern, text)
  return result != None

print(check_web_address("gmail.com")) # True
print(check_web_address("www@google")) # False
print(check_web_address("www.Coursera.org")) # True
print(check_web_address("web-address.com/homepage")) # False
print(check_web_address("My_Favorite-Blog.US")) # True
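
One thing to keep in mind with this pattern: it passes the five samples because it rejects `/` and `@`, not because it looks for a top-level domain. A small check, using extra inputs that are my own and not part of the exercise:

```python
import re

pattern = r'^[A-Za-z._-][^/@]*$'  # pattern from Solution 1

# Hypothetical extra inputs, not part of the exercise.
extra = ["localhost", "no dots here", "gmail.com!", "9gag.com"]
results = {s: re.search(pattern, s) is not None for s in extra}
for s, ok in results.items():
    print(f"{s!r}: {ok}")
```

The first three are accepted even though none ends in a period plus letters, and "9gag.com" is rejected because the first character class has no digits.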

Solution 2:[2]

This works for the sample inputs:

  pattern = r"^\w.*\.[a-zA-Z]*$"
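
As a rough check of how permissive this is: the `.*` wildcard lets any character through (including `@` and spaces), and the trailing `*` allows an empty top-level domain. The extra inputs below are my own, not from the exercise.

```python
import re

pattern = r"^\w.*\.[a-zA-Z]*$"  # pattern from Solution 2

# Hypothetical extra inputs, not part of the exercise.
extra = ["www@google.com", "gmail.", "my site.com"]
accepted = [s for s in extra if re.search(pattern, s)]
print(accepted)
```

All three are accepted, so this pattern is fine for the five samples but looser than the problem statement.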

Solution 3:[3]

The pattern that I used to accomplish this was:

pattern = r"^[\w]*[\.\-\+][^/@]*$"
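
A note on writing regex patterns as raw strings (the `r` prefix): in a normal string literal Python processes backslash escapes before the regex engine ever sees them, and recent Python versions warn about unrecognized ones such as `\w`. A minimal sketch of the difference, using `\b` as the example since it is a real string escape:

```python
import re

# In a normal literal "\b" is a single backspace character;
# in a raw literal the backslash is kept, so it stays two characters.
print(len("\b"), len(r"\b"))

# With the r prefix, \w, \. and \+ reach the regex engine unchanged.
pattern = r"^[\w]*[\.\-\+][^/@]*$"
print(re.search(pattern, "gmail.com") is not None)
print(re.search(pattern, "www@google") is not None)
```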

Solution 4:[4]

I tried with this pattern:

pattern = r"\.[a-zA-Z]+$"

This should work too. It checks that the text ends with a period followed by one or more uppercase or lowercase letters.

Solution 5:[5]

pattern = r"^[\w.+-]+\.[a-zA-Z]+$"
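
Worth knowing for patterns of this shape: inside square brackets `|` is a literal pipe character, not alternation, so a class that contains it will accept pipes in the address. A short comparison, where `with_pipe`, `without_pipe` and the sample inputs are names and values chosen just for this sketch:

```python
import re

with_pipe = r"^[\w|\.\-\_]+\.[a-zA-Z]+$"  # class contains a literal '|', no '+'
without_pipe = r"^[\w.+-]+\.[a-zA-Z]+$"   # '|' dropped, '+' allowed

# Hypothetical inputs, not part of the exercise.
for s in ["my|site.com", "a+b.com"]:
    print(s, re.search(with_pipe, s) is not None,
          re.search(without_pipe, s) is not None)
```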

Solution 6:[6]

import re
def check_web_address(text):
  pattern = r'^[\w\-+.]+\.[a-zA-Z]+$'
  result = re.search(pattern, text)
  return result != None

print(check_web_address("gmail.com")) # True
print(check_web_address("www@google")) # False
print(check_web_address("www.Coursera.org")) # True
print(check_web_address("web-address.com/homepage")) # False
print(check_web_address("My_Favorite-Blog.US")) # True

Explanations

  1. ^[\w\-+.]+ => the text must start with one or more word characters, plus signs, dashes, or periods, e.g. www.Coursera or www.89+-; the trailing '+' requires at least one such character.
  2. \. => matches the literal period that separates the name from the top-level domain, as in www.somedomain.
  3. [a-zA-Z]+$ => matches a letters-only ending such as .in or .IN, because the top-level domains here contain no special characters.

Hope it helps :) Happy Stacking
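
One pitfall to watch for in letter ranges like the one above: writing `A-z` instead of `A-Z`. In ASCII the characters `[`, `\`, `]`, `^`, `_` and the backtick sit between `Z` and `a`, so `A-z` quietly accepts them. A minimal sketch (the names `loose` and `strict` and the sample endings are made up here):

```python
import re

loose = r'^[a-zA-z]+$'   # 'A-z' also covers [ \ ] ^ _ and the backtick
strict = r'^[a-zA-Z]+$'  # letters only

# Hypothetical top-level-domain candidates.
for tld in ["com", "c_m", "c^m"]:
    print(tld, bool(re.search(loose, tld)), bool(re.search(strict, tld)))
```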

Solution 7:[7]

I did it this way

import re
def check_web_address(text):
  pattern = r'^[A-Za-z0-9-_+.]*[.][A-Za-z]*$'
  result = re.search(pattern, text)
  return result != None

print(check_web_address("gmail.com")) # True
print(check_web_address("www@google")) # False
print(check_web_address("www.Coursera.org")) # True
print(check_web_address("web-address.com/homepage")) # False
print(check_web_address("My_Favorite-Blog.US")) # True

^ start of the string

[A-Za-z0-9-_+.]* zero or more alphanumeric characters (letters, numbers, and underscores), periods, dashes, or plus signs

[.] followed by a literal period

[A-Za-z]* a letters-only top-level domain such as ".com", ".info", ".edu", etc. (since * also allows zero letters, + is the stricter choice)

$ end of the string
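
To illustrate the difference between the `*` and `+` repetition qualifiers here, a small comparison. The `plus` variant is my own tightened version, not the answer's pattern, and the inputs are hypothetical.

```python
import re

star = r'^[A-Za-z0-9-_+.]*[.][A-Za-z]*$'  # pattern from this answer
plus = r'^[A-Za-z0-9_+.-]+[.][A-Za-z]+$'  # assumed stricter variant

# Hypothetical inputs, not part of the exercise.
for s in ["gmail.", ".", "gmail.com"]:
    print(repr(s), re.search(star, s) is not None,
          re.search(plus, s) is not None)
```

With `*`, both "gmail." and a lone "." are accepted because zero characters satisfy each repeated class; with `+` only "gmail.com" is.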

Solution 8:[8]

import re
def check_web_address(text):
  pattern = r'\.(com|edu|org|info|US|net|int|mil|gov)$'
  result = re.search(pattern, text)
  return result != None

print(check_web_address('gmail.com')) # True
print(check_web_address('www.google')) # False
print(check_web_address('www.Coursera.org')) # True
print(check_web_address('web-address.com/homepage')) # False
print(check_web_address('My_Favorite-Blog.US')) # True
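
When listing specific top-level domains, note that a character class matches one character at a time from a set, so putting the letters of several TLDs inside `[...]` does not restrict the match to those words; a group with `|` does. A sketch of the two forms (`as_class` and `as_alternation` are names chosen for this comparison):

```python
import re

as_class = r'\.[comeduorginfoUSnetintmilgov]*$'               # any mix of these letters
as_alternation = r'\.(com|edu|org|info|US|net|int|mil|gov)$'  # one of these words

# "site.moc" is a hypothetical input.
for s in ["www.google", "site.moc", "gmail.com"]:
    print(s, re.search(as_class, s) is not None,
          re.search(as_alternation, s) is not None)
```

The class form accepts "www.google" and "site.moc" because every letter after the period happens to be in the set.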

Solution 9:[9]

This should work fine:

pattern = r".*\.[A-Za-z]{1,3}.$"

In one long sentence:

Here we accept any number of characters (.*) followed by a period (\.), then check for 1 to 3 letters, uppercase or lowercase, followed by one final character of any kind ([A-Za-z]{1,3}.$). Here:

.* : accepts any number of any characters.

\. : the backslash '\' is the escape character, so this matches only a literal period ('.').

[A-Za-z] : a character class that accepts uppercase and lowercase letters.

{1,3} : limits the preceding class ([A-Za-z]) to between 1 and 3 repetitions (both 1 and 3 are included).

. : matches any single character (except a newline); it is not affected by the preceding {1,3}, so the last character of the text does not have to be a letter.

$ : means the string must end here.
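
Put together, this pattern needs 2 to 4 characters after the last period, and only the final one can be a non-letter. A quick check with a few inputs of my own (everything except "gmail.com" is hypothetical):

```python
import re

pattern = r".*\.[A-Za-z]{1,3}.$"  # pattern from this answer

inputs = ["gmail.com", "gmail.info", "a.b1", "a.b", "x.museum"]
checks = {s: re.search(pattern, s) is not None for s in inputs}
for s, ok in checks.items():
    print(f"{s!r}: {ok}")
```

"a.b1" is accepted because the trailing `.` takes the digit, while "a.b" (one character after the period) and "x.museum" (six) are rejected.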

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Vinod
Solution 2 Divya Pateriya
Solution 3 Liam
Solution 4 unacorn
Solution 5 cigien
Solution 6 hemant singh
Solution 7 Haris
Solution 8 LauraL
Solution 9