Regular expression for matching multiple numeric ranges of different lengths

I need to validate if the value a user enters is one of the following:

  • a 7 digit number ranging from 7000000 - 7999999
  • a 9 digit number ranging from 777000000 - 777777777
  • an 11 digit number ranging from 77700000000 - 77777777777

What I have so far, which seems to work, is: ^7\d{7}|777\d{7}|777\d{9}$

However, since I'm new to regex, I wanted to confirm whether this is the most efficient way.

Thanks in advance!



Solution 1:[1]

Incorrect

  1. Since | has lower precedence than concatenation, your pattern is parsed as:

    ^7\d{7}
    |
    777\d{7}
    |
    777\d{9}$
    

    The anchors only apply to the first and the last sub-pattern.

  2. Also, "7-digit number ranging from 7000000 - 7999999" means that it has 6 free digits, not 7 as 7\d{7} specifies. The other sub-patterns have the same off-by-one error.

  3. Since you disallow 777999999 in your range 777000000 - 777777777, the pattern cannot be as simple as 777\d{6} (if you want to do everything with regex).

    For matching the range 777000000 - 777777777, you need:

    777(?:[0-6]\d{5}|7(?:[0-6]\d{4}|7(?:[0-6]\d{3}|7(?:[0-6]\d{2}|7(?:[0-6]\d|7[0-7])))))
    

    It matches the prefix 777, then the suffixes (in order):

    • 000000 - 699999
    • 700000 - 769999
    • 770000 - 776999
    • 777000 - 777699
    • 777700 - 777769
    • 777770 - 777777

    (I assume that you want to match 777000009)
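
The nested pattern above can be spot-checked against a plain numeric comparison. This is only a sketch in JavaScript: the anchors ^ and $ are added here, and the helper name inRange and the sample values are made up for illustration.

```javascript
// Pattern from the answer, wrapped in ^...$ so it must match the whole string.
const nineDigit = /^777(?:[0-6]\d{5}|7(?:[0-6]\d{4}|7(?:[0-6]\d{3}|7(?:[0-6]\d{2}|7(?:[0-6]\d|7[0-7])))))$/;

// Reference check: plain numeric comparison (hypothetical helper).
const inRange = (s) =>
  /^\d{9}$/.test(s) && Number(s) >= 777000000 && Number(s) <= 777777777;

// Values at the edges of each suffix group, plus some that must be rejected.
const samples = [
  '777000000', '777000009', '777699999', '777700000', '777769999',
  '777776999', '777777699', '777777769', '777777777',
  '777777778', '777777780', '777800000', '777999999', '776999999',
];

// Collect every sample where the regex and the numeric check disagree.
const disagreements = samples.filter((s) => nineDigit.test(s) !== inRange(s));
console.log(disagreements);
```

An empty list means the two checks agree on every sample.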

Solution

Fixing the first and second problems is easy: wrap the alternatives in a non-capturing group (?:pattern) and adjust the number of repetitions:

^(?:7\d{6}|777\d{6}|777\d{8})$
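
To see the effect of both fixes, this JavaScript sketch runs the original pattern and the corrected one over a few inputs. The sample strings are made up for illustration.

```javascript
const original = /^7\d{7}|777\d{7}|777\d{9}$/;
const fixed = /^(?:7\d{6}|777\d{6}|777\d{8})$/;

const inputs = [
  '7000000',       // valid 7-digit value
  '70000000abc',   // 8 digits followed by junk
  'abc7770000000', // junk followed by 10 digits
  '777123456',     // valid 9-digit value
  '777999999',     // 9 digits, but above 777777777
];

for (const s of inputs) {
  console.log(s, 'original:', original.test(s), 'fixed:', fixed.test(s));
}
```

The original pattern rejects the valid '7000000' and accepts both junk strings, while the grouped pattern does the opposite. Both still accept '777999999', which is the third problem.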

However, the 3rd problem is not easy to resolve with regex alone (possible, but you will end up with a mess of code).

As suggested, parsing the text into a number and working with that would be easier. An 11-digit number fits in a 64-bit integer type (use an integer type if possible) or in a double-precision floating-point value (in JavaScript, Number is a double-precision float, which has 53 bits of precision).
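
The precision claim is easy to check in JavaScript. A small sketch, with a made-up variable name:

```javascript
// Integers up to 2^53 - 1 are represented exactly by a double.
console.log(Number.MAX_SAFE_INTEGER === 2 ** 53 - 1);

// The largest value in the question is far below that limit,
// so parsing it and doing arithmetic on it loses nothing.
const largest = Number('77777777777');
console.log(Number.isSafeInteger(largest));
console.log(largest + 1 === 77777777778);
```

All three lines print true.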

Solution 2:[2]

This is not a problem for regexes. Regexes are for matching patterns, not evaluating numeric values.

Get the user input and then compare it to the numerical values.

$ok =
    ($n >= 7000000 && $n <= 7999999) 
    ||
    ($n >= 777000000 && $n <= 777777777)
    ||
    ($n >= 77700000000 && $n <= 77777777777);

Regexes are not a magic wand you wave at every problem that happens to involve strings.

Finally, don't worry about "most efficient" until you have "works correctly."

Solution 3:[3]

Your best bet is to capture the full digit string with ^(7\d{6}|777\d{6}|777\d{8})$ and then evaluate it as a number (this assumes that the 7, 9 or 11 digit number is the entire input).
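
A minimal sketch of that approach in JavaScript, assuming the whole input is the candidate number. The names shape, ranges and isValid are hypothetical.

```javascript
// Step 1: the regex only checks the shape (prefix and digit count).
const shape = /^(7\d{6}|777\d{6}|777\d{8})$/;

// Step 2: the numeric bounds from the question.
const ranges = [
  [7000000, 7999999],
  [777000000, 777777777],
  [77700000000, 77777777777],
];

function isValid(input) {
  const match = shape.exec(input);
  if (match === null) return false;
  const n = Number(match[1]); // captured digit string as a number
  return ranges.some(([lo, hi]) => n >= lo && n <= hi);
}

console.log(isValid('777000009'), isValid('777777778'), isValid('7999999'));
```

The regex rejects anything that is not a plain digit string of the right length, so Number() never sees input like '7e6' or ' 7000000 '.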

Solution 4:[4]

Wouldn't this work?

^(?=7[0-9]{6}$|777[0-7]{6}$|777[0-7]{8}$).+
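
A quick JavaScript check suggests it only partly works, assuming (as Solution 1 does) that values such as 777000009 should be accepted. The variable name candidate is made up.

```javascript
const candidate = /^(?=7[0-9]{6}$|777[0-7]{6}$|777[0-7]{8}$).+/;

console.log(candidate.test('777777777')); // upper bound of the 9-digit range
console.log(candidate.test('777777778')); // just above the upper bound
console.log(candidate.test('777000009')); // inside the range, but contains a 9
```

The class [0-7] limits every digit after the prefix to 7 or less, so the pattern rejects 777777778 as intended but also rejects in-range values that contain an 8 or 9, such as 777000009.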

Solution 5:[5]

Doing something like that is not trivial because the allowed digits on place [n] are dependent on the digit on the place [n-1].

Consider the range 7000-7777.

  • First digit can be only 7
  • Second can be [0-6], with the remaining digits [0-9], or
    • Second is 7 and:
      • Third is [0-6] and fourth [0-9] or
        • Third is 7 and fourth is [0-7]

This gives something like:

7(([0-6][0-9]{2})|(7(([0-6][0-9])|(7[0-7]))))

Let's test it (JavaScript):

var regexStr = '7(([0-6][0-9]{2})|(7(([0-6][0-9])|(7[0-7]))))';
var regex = new RegExp(regexStr);
var str = ['7008', '7777', '7798', '7584', '7689', '7784'];
for (var i = 0; i < str.length; i++) {
  // exec returns a match array for values in range and null otherwise
  console.log(str[i], regex.exec(str[i]));
}

The more digits you have, the bigger the regex becomes. I can't find any other way to implement integer comparison in regex...
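
Because the pattern grows mechanically with each digit, it can be generated instead of written by hand. The sketch below is one possible way to do that in JavaScript. The function name atMost is hypothetical, and it only handles the upper bound, which is enough here because the lower bounds in the question are a fixed prefix followed by zeros.

```javascript
// Build a pattern for digit strings with the same length as `upper`
// whose numeric value is less than or equal to `upper`.
function atMost(upper) {
  if (upper.length === 0) return '';
  const first = Number(upper[0]);
  const rest = upper.slice(1);
  const branches = [];
  if (first > 0) {
    // A smaller digit here leaves the remaining digits unconstrained.
    const tail = rest.length > 0 ? `\\d{${rest.length}}` : '';
    branches.push(`[0-${first - 1}]${tail}`);
  }
  // The same digit here means the remaining digits must stay within bounds.
  branches.push(upper[0] + atMost(rest));
  return branches.length === 1 ? branches[0] : `(?:${branches.join('|')})`;
}

// Range 7000-7777: fixed first digit 7, then three digits of at most 777.
const fourDigit = new RegExp(`^7${atMost('777')}$`);
console.log(atMost('777'));
console.log(['7008', '7777', '7798', '7584', '7689', '7784'].map((s) => fourDigit.test(s)));
```

The generated pattern has one nesting level per digit, matching the observation that the regex gets bigger as the number of digits grows.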

Solution 6:[6]

Given that floating-point numbers and conditionals don't mix very well,
here is a regex that should cover all the target ranges:

^(?:7\d{6}|(?:777[0-6]\d{7}|7777[0-6]\d{6})|(?:(?:77)?(?:777[0-6]\d{5}|7777[0-6]\d{4}|77777[0-6]\d{3}|777777[0-6]\d{2}|7777777[0-6]\d|77777777[0-7])))$

Expanded

 ^ 
 (?:
      7 \d{6} 
   |  

      (?:
           777 [0-6] \d{7} 
        |  7777 [0-6] \d{6} 
      )
   |  
      (?:
           (?: 77 )?
           (?:
                777 [0-6] \d{5} 
             |  7777 [0-6] \d{4} 
             |  77777 [0-6] \d{3} 
             |  777777 [0-6] \d{2} 
             |  7777777 [0-6] \d 
             |  77777777 [0-7] 
           )
      )
 )
 $
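
One way to gain confidence in a pattern this long is to compare it with a numeric check over many inputs. The JavaScript sketch below does that. The names all, ranges, expected and mismatches are made up, and the 11-digit range is only sampled with a stride rather than tested exhaustively.

```javascript
const all = /^(?:7\d{6}|(?:777[0-6]\d{7}|7777[0-6]\d{6})|(?:(?:77)?(?:777[0-6]\d{5}|7777[0-6]\d{4}|77777[0-6]\d{3}|777777[0-6]\d{2}|7777777[0-6]\d|77777777[0-7])))$/;

const ranges = [
  [7000000, 7999999],
  [777000000, 777777777],
  [77700000000, 77777777777],
];
// Reference answer: plain numeric comparison against the three ranges.
const expected = (n) => ranges.some(([lo, hi]) => n >= lo && n <= hi);

let mismatches = 0;
const check = (n) => {
  if (all.test(String(n)) !== expected(n)) mismatches++;
};

// Every 9-digit value that starts with 777.
for (let n = 777000000; n <= 777999999; n++) check(n);
// A strided sample of the 11-digit values that start with 777.
for (let n = 77700000000; n <= 77799999999; n += 9973) check(n);
// Edges of the 7-digit and 11-digit ranges.
[6999999, 7000000, 7999999, 8000000, 77699999999, 77777777777, 77777777778].forEach(check);

console.log(mismatches);
```

A count of 0 means the pattern and the numeric comparison agreed on every value tried.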

Solution 7:[7]

This is fairly easily handled using negative lookahead. For the 11-digit case we can use the following regular expression:

^(?!7+[89])(?<Number>777[\d]{8})$

This is easily extended to include the other cases.
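
One possible extension, sketched in JavaScript, is shown below. It is an assumption about how the cases could be combined, not the answerer's own pattern. The named group is dropped, and the lookahead is applied only to the 9- and 11-digit alternatives, because a 7-digit value such as 7800000 is valid even though it has an 8 right after a 7.

```javascript
// 7-digit case: any 7 followed by six digits.
// 9- and 11-digit cases: prefix 777, and the first digit that is not a 7
// must not be 8 or 9, otherwise the value exceeds 777...7.
const pattern = /^(?:7\d{6}|(?!7+[89])777(?:\d{6}|\d{8}))$/;

const samples = [
  '7800000', '777000009', '777777777', '777777778',
  '77777777777', '77777777780', '77700000',
];
for (const s of samples) {
  console.log(s, pattern.test(s));
}
```

The lookahead works because a number that starts with 777 exceeds the all-sevens upper bound exactly when its first non-7 digit is an 8 or a 9.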

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 nhahtdh
Solution 2
Solution 3 HamZa
Solution 4 Eugene
Solution 5 Danubian Sailor
Solution 6
Solution 7 M Kloster