Python Regular Expression pattern r'(.*[0-9]){5,}'

I thought the braces {5,} required at least 5 matches.

import re

rMatch = re.search(r'(.*[0-9]){5,}', 'A1B2C3D4E5')
print(rMatch)
print(rMatch.groups())

Why only one group? How can I see the 5 matches that are taking place given the use of curly braces?



Solution 1:[1]

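Here the curly braces state that (.*[0-9]) must be matched at least 5 times ({5,} sets a minimum with no upper limit). The match succeeds, but a repeated capture group keeps only the text from its last repetition, which is why groups() returns a single item. You cannot get the individual repetitions back from a quantified group, but you can watch the progression by raising the count one step at a time. Asking for 6 repetitions makes the search fail, because the string contains only 5 digits.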

import re

# Each call asks for exactly n repetitions of "anything, then a digit".
rMatch = re.search(r'(.*[0-9]){1}', 'A1')
print(rMatch)
rMatch = re.search(r'(.*[0-9]){2}', 'A1@2')
print(rMatch)
rMatch = re.search(r'(.*[0-9]){3}', 'A1@2C3')
print(rMatch)
rMatch = re.search(r'(.*[0-9]){4}', 'A1@2C3D4')
print(rMatch)
rMatch = re.search(r'(.*[0-9]){5}', 'A1@2C3D4E5')
print(rMatch)
rMatch = re.search(r'(.*[0-9]){6}', 'A1@2C3D4E5')
print(rMatch)

Output:

<re.Match object; span=(0, 2), match='A1'>
<re.Match object; span=(0, 4), match='A1@2'>
<re.Match object; span=(0, 6), match='A1@2C3'>
<re.Match object; span=(0, 8), match='A1@2C3D4'>
<re.Match object; span=(0, 10), match='A1@2C3D4E5'>
None
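
To see why the original pattern reports only one group, it may help to print what that group holds after the match. The sketch below is an illustration and not part of the original answers: it reuses the input string from the question, and the names text, repeated and explicit are made up for this example. It also shows one possible workaround, writing the group out five times with a lazy .*? so that each repetition gets its own group number.

```python
import re

text = 'A1B2C3D4E5'

# A quantified capture group is still just group 1; each repetition
# overwrites the previous one, so only the last repetition is kept.
repeated = re.search(r'(.*[0-9]){5,}', text)
print(repeated.group(0))   # A1B2C3D4E5
print(repeated.groups())   # ('E5',)

# Writing the group out five times gives five separate group numbers.
# The lazy .*? stops at the first digit it can reach.
explicit = re.search(r'(.*?[0-9])' * 5, text)
print(explicit.groups())   # ('A1', 'B2', 'C3', 'D4', 'E5')
```

The explicit form only suits a fixed repetition count; for an open-ended "at least 5" it is usually simpler to collect the pieces in a separate step.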

Solution 2:[2]

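A more precise pattern would be (?:[A-Z]+[0-9]+){5,}, which requires at least 5 consecutive runs of uppercase letters followed by digits and uses a non-capturing group: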

import re

rMatch = re.search(r'(?:[A-Z]+[0-9]+){5,}', 'A1B2C3D4E5')
if rMatch:
    print("MATCH")

If you want to actually capture the 5 letter-digit pairs, then re.findall works better:

inp = "A1B2C3D4E5"
matches = re.findall(r'[A-Z]+[0-9]+', inp)
print(matches)  # ['A1', 'B2', 'C3', 'D4', 'E5']
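
The two steps above, checking that there are at least 5 pairs and then listing them, could be combined along these lines. This is a minimal sketch under stated assumptions: the helper name extract_pairs and the constant PAIR are hypothetical, and it uses re.fullmatch instead of re.search, which assumes the whole string should consist of nothing but letter-digit pairs.

```python
import re

# Hypothetical building block: one run of uppercase letters, then digits.
PAIR = r'[A-Z]+[0-9]+'

def extract_pairs(text, minimum=5):
    """Return the letter-digit pairs in text, or None if the string is not
    made up entirely of at least `minimum` such pairs."""
    # Validate with a non-capturing group and a {minimum,} quantifier.
    pattern = '(?:' + PAIR + '){' + str(minimum) + ',}'
    if re.fullmatch(pattern, text) is None:
        return None
    # Collect the individual pairs in a separate pass.
    return re.findall(PAIR, text)

print(extract_pairs('A1B2C3D4E5'))  # ['A1', 'B2', 'C3', 'D4', 'E5']
print(extract_pairs('A1B2'))        # None
```

Keeping validation and extraction as separate passes avoids relying on a quantified capture group, which would only remember its last repetition.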

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2