Regex to capture exact numbers in string

I have a column that looks like this:

11/33/4500030050
4100000300/4503134501
4100030300+4503114501
11

The regular expression should capture the following:

4500030050
4100000300/4503134501
4100030300+4503114501
''

Here's my current regular expression:

 col.str.findall(r'[/+ #_;.-]?(?<![0-9])[0-9]{10}(?![0-9])').str.join('').str.lstrip('/+ #_;.-')

This, however, captures every number that has exactly 10 digits. How can I modify it so that it only captures 10-digit numbers that start with 41 or 45?



Solution 1:[1]

You can use

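[/+ #_;.-]?(?<![0-9])4[15][0-9]{8}(?![0-9])

See the regex demo.

Details:

  • [/+ #_;.-]? - an optional /, +, space, #, _, ;, . or - (the / is kept from the original pattern so that the separator in 4100000300/4503134501 survives)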
  • (?<![0-9]) - left digit boundary: no digit may immediately precede the number
  • 4 - the literal digit 4
  • [15] - 1 or 5
  • [0-9]{8} - eight digits
  • (?![0-9]) - right digit boundary: no digit may immediately follow the number.
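
To check the pattern against the sample rows from the question, here is a minimal sketch using only Python's re module. It assumes the / is kept in the separator class, as in the question's original expression, so that the / between two numbers is preserved. The names SEPARATORS, PATTERN, extract_numbers and samples are illustrative and do not come from the question or the answer.

```python
import re

# Separator characters that may precede a number (same set the question strips).
SEPARATORS = "/+ #_;.-"

# Optional separator, then a 10-digit number starting with 41 or 45 that is
# not glued to any other digit on either side.
PATTERN = re.compile(r"[/+ #_;.-]?(?<![0-9])4[15][0-9]{8}(?![0-9])")


def extract_numbers(cell: str) -> str:
    """Join all matches in one cell and drop a leading separator, if any."""
    joined = "".join(PATTERN.findall(cell))
    return joined.lstrip(SEPARATORS)


# Sample rows copied from the question.
samples = [
    "11/33/4500030050",
    "4100000300/4503134501",
    "4100030300+4503114501",
    "11",
]

results = [extract_numbers(s) for s in samples]
print(results)
# ['4500030050', '4100000300/4503134501', '4100030300+4503114501', '']
```

In pandas, the same pattern string should slot into the chain from the question: col.str.findall(pattern).str.join('').str.lstrip('/+ #_;.-').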

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Wiktor Stribiżew