RegEx capture group where delimiter can appear in the group
I need a regex that will capture the following:
foo-bar-fort-worth-tx
1st group: (foo-bar) 2nd group: (fort-worth) 3rd group: (tx)
I'm having trouble since the delimiter '-' can also appear in the capture group. Luckily, I think there will be at most one '-' in each capture group.
Here's my regex:
^(.+)-(.+)-[a-zA-Z]{2}$
However, it's not working as expected. Any help would be much appreciated.
Solution 1:[1]
There are a few errors in your pattern. A capturing group such as (.+) matches (and captures) any character (.) as many times as possible, because + is greedy. The first group therefore starts by consuming the whole string. The pattern then requires a -, so the regex backtracks, giving characters back only until the rest of the pattern can match. The final two letters are also not wrapped in parentheses, so they are matched but never captured. Hence you end up with:
Group 1. 0-12 `foo-bar-fort`
Group 2. 13-18 `worth`
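The greedy behaviour can be checked directly. The snippet below is a minimal sketch that assumes Python's `re` module (the question does not name a language); the variable names are only illustrative.

```python
import re

text = "foo-bar-fort-worth-tx"

# The pattern from the question: two greedy groups, and no parentheses
# around the final two-letter part, so that part is matched but not captured.
original = re.compile(r"^(.+)-(.+)-[a-zA-Z]{2}$")

match = original.match(text)
for index in (1, 2):
    # span() returns the (start, end) offsets of the captured group
    print(index, match.span(index), match.group(index))
# 1 (0, 12) foo-bar-fort
# 2 (13, 18) worth
```

The first group keeps as much as it can, so it only gives back enough characters for the remaining `-(.+)-[a-zA-Z]{2}` to match.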
Use the following expression which uses negated character sets to match and capture the patterns you described:
^([^-]+-[^-]+)-([^-]+-[^-]+)-(.*)$
It produces the following groups:
Group 1. 0-7 `foo-bar`
Group 2. 8-18 `fort-worth`
Group 3. 19-21 `tx`
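As a rough sketch of how the negated-character-class pattern could be used, here it is in Python (an assumed language; `split_parts` is a hypothetical helper name, not something from the question):

```python
import re

# Each of the first two groups is "token-token"; [^-]+ cannot cross a hyphen,
# so the hyphens between groups are the only places the groups can split.
pattern = re.compile(r"^([^-]+-[^-]+)-([^-]+-[^-]+)-(.*)$")

def split_parts(text):
    """Return the three captured parts, or None if the text does not fit."""
    match = pattern.match(text)
    return match.groups() if match else None

print(split_parts("foo-bar-fort-worth-tx"))  # ('foo-bar', 'fort-worth', 'tx')
print(split_parts("foo-bar-tx"))             # None
```

Note that this pattern expects exactly one '-' inside each of the first two groups, so input with fewer hyphens does not match. If the last part must be exactly two letters, as in the pattern from the question, `([a-zA-Z]{2})` could replace `(.*)`.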
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
