Regex to match only the first occurrence of four numbers in a line
I would like to sort thousands of bibliographic entries via RegEx. Every entry is built like this:
Lastname, Firstname. 1900. Title etc.
Now I need a RegEx to match 1900. This works:
[0-9]{4}
Unfortunately, some titles include more than one four-digit group, for example:
Lastname, Firstname. 1900. Title: 1920-1930. etc.
But I want to match only the first four-digit group (i.e. 1900, but not 1920 or 1930).
Any help would be appreciated!
Solution 1:[1]
Solution 2:[2]
Simply use this:
\b\d{4}\b
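On its own, this pattern matches every standalone group of 4 consecutive digits. To get only the first occurrence, take the first match (a single, non-global search) instead of collecting all matches.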
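As a minimal sketch in Python (assuming the standard `re` module, and using the sample line from the question), the difference between taking the first match and collecting all matches might look like this:

```python
import re

# Sample line taken from the question.
line = "Lastname, Firstname. 1900. Title: 1920-1930. etc."
pattern = re.compile(r"\b\d{4}\b")

# search() scans left to right and stops at the first match.
first = pattern.search(line)
first_year = first.group(0) if first else None

# findall() returns every standalone four-digit group.
all_years = pattern.findall(line)

print(first_year)  # 1900
print(all_years)   # ['1900', '1920', '1930']
```

Other regex engines usually offer the same choice, typically as a global versus non-global match.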
Solution 3:[3]
With this regex, you get only the first four digits in the text, whether or not they are enclosed in parentheses. They are captured in group 1.
^[^\d]*(\d{4})
Explanation:
The regex contains two parts. First part:
^ matches at the start of the string.
[^\d]* matches any run of non-digit characters.
Combined, these two match all non-digit characters up to the first digit.
Second part:
() creates a capturing group.
\d{4} matches exactly four digits.
Together, the two parts ensure that only the first four digits in the line are captured in the group. If the first run of digits is shorter than four digits, the regex does not match at all.
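The anchored pattern can be tried out with a short Python sketch. The helper name `extract_year`, the second sample entry, and the choice to sort entries without a year last are assumptions made for illustration, not part of the original answer:

```python
import re

# Anchored pattern from this answer: skip non-digits, capture four digits.
FIRST_YEAR = re.compile(r"^[^\d]*(\d{4})")

def extract_year(entry):
    """Return the first four-digit group as an int, or None if there is none."""
    m = FIRST_YEAR.match(entry)
    return int(m.group(1)) if m else None

entries = [
    "Lastname, Firstname. 1900. Title: 1920-1930. etc.",
    "Other, Author. 1875. Another title.",  # hypothetical entry
]

# Sort by year; entries without a year go last (an assumed policy).
by_year = sorted(
    entries,
    key=lambda e: (extract_year(e) is None, extract_year(e) or 0),
)

for entry in by_year:
    print(extract_year(entry), entry)
```

Note that an entry such as "3rd ed. 1900" yields no match here, because the first digit run ("3") is shorter than four digits.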
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | |
| Solution 3 | stanimirsp |
