RegEx capture group where delimiter can appear in the group

I need a regex that will capture the following:

foo-bar-fort-worth-tx

1st group: (foo-bar) 2nd group: (fort-worth) 3rd group: (tx)

I'm having trouble since the delimiter '-' can also appear in the capture group. Luckily, I think there will be at most one '-' in each capture group.

Here's my regex:

^(.+)-(.+)-[a-zA-Z]{2}$

However, it's not working as expected. Any help would be much appreciated.



Solution 1:[1]

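There are a few problems with your pattern. A capturing group such as (.+) matches (and captures) any character (.) as many times as possible, because + is greedy. Stepping through the pattern in a regex debugger shows this. Since . matches anything, the first (.+) initially runs to the end of the string. The pattern then asks for a -, so the engine backtracks until it finds the last -, leaving the first group holding:

Group 1.    0-18    `foo-bar-fort-worth`

From there the rest of the pattern, (.+)-[a-zA-Z]{2}$, cannot match the remaining tx, so the engine backtracks once more, to the previous -. The final match is therefore split in the wrong place (group 1 is `foo-bar-fort`, group 2 is `worth`), and because [a-zA-Z]{2} is not wrapped in parentheses, tx is not captured at all.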
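To see the greediness concretely, here is a minimal sketch using Python's standard `re` module. The variable names (`text`, `first_try`, `full`) are only illustrative, and other regex flavours should behave the same way for this pattern. It prints what a lone greedy (.+) grabs before the first -, and then what the full pattern from the question ends up capturing.

```python
import re

text = "foo-bar-fort-worth-tx"

# Step 1: a lone greedy (.+) followed by "-" backtracks only as far as
# the LAST hyphen in the string.
first_try = re.match(r"^(.+)-", text)
print(first_try.group(1), first_try.span(1))  # foo-bar-fort-worth (0, 18)

# Step 2: the full pattern from the question. To satisfy the second "-"
# and the two trailing letters, the engine has to give back one more
# segment, so the split lands in the wrong place.
full = re.match(r"^(.+)-(.+)-[a-zA-Z]{2}$", text)
print(full.groups())  # ('foo-bar-fort', 'worth')
```

Note that the state code never shows up in `full.groups()`, since `[a-zA-Z]{2}` sits outside any capturing group.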

Use the following expression instead. It uses negated character classes ([^-] matches any character except -) so that each of the first two groups captures exactly two hyphen-separated parts:

^([^-]+-[^-]+)-([^-]+-[^-]+)-(.*)$

Against the sample string it produces:

Group 1.    0-7     `foo-bar`
Group 2.    8-18    `fort-worth`
Group 3.    19-21   `tx`
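As a quick check, the same pattern can be run in Python. This is a sketch, assuming the standard `re` module; the names `pattern`, `m`, `groups`, `spans` and the second test string are made up for illustration. It also shows one limitation worth knowing: the first two groups must each contain exactly one -, so an input where a group has no hyphen does not match.

```python
import re

# [^-]+ matches a run of non-hyphen characters, so each of the first two
# groups is forced to be "part-part"; the last group takes the rest.
pattern = re.compile(r"^([^-]+-[^-]+)-([^-]+-[^-]+)-(.*)$")

m = pattern.match("foo-bar-fort-worth-tx")
groups = m.groups()
spans = [m.span(i) for i in (1, 2, 3)]
print(groups)  # ('foo-bar', 'fort-worth', 'tx')
print(spans)   # [(0, 7), (8, 18), (19, 21)]

# Hypothetical input whose first group has no hyphen: the pattern rejects it.
no_match = pattern.match("foo-fort-worth-tx")
print(no_match)  # None
```

If the last part should be restricted to a two-letter code, as in the question, (.*) could be swapped for ([a-zA-Z]{2}).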

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1