Non-capturing group lazy when pattern repeats

I am trying to capture one or two groups per line. If a line contains a dash, I want one group for the text before the dash and one for the text after it. If it does not, I want a single group containing the whole line.

However, occasionally a line will start with 'Remove - ', which is a phrase I would like to ignore.

Example data:

| Strings |
| -------- |
| Remove - Precision Speed - Recap |
| Precision Speed - Recap |
| Remove - Precision Speed |
| Precision Speed |

The first two should each capture group 1: 'Precision Speed' and group 2: 'Recap', while the last two should capture only one group: 'Precision Speed'.

Right now I have ^(?:Remove - )?(.+)(?:\s*-\s*)(.*), and it works for the first two lines (I believe because they contain a second dash). For the third line it captures 'Remove' and 'Precision Speed', and for the fourth line it does not match at all.
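For reference, the behaviour described above can be reproduced with a short sketch. It assumes Python's re module, since the question does not name a regex flavour, and the names ORIGINAL, LINES and original_groups are made up for illustration.

```python
import re

# The pattern from the question: the separator group is mandatory,
# so a line without any dash cannot match at all.
ORIGINAL = re.compile(r"^(?:Remove - )?(.+)(?:\s*-\s*)(.*)")

LINES = [
    "Remove - Precision Speed - Recap",
    "Precision Speed - Recap",
    "Remove - Precision Speed",
    "Precision Speed",
]

def original_groups(line):
    """Return the captured groups, or None when the pattern does not match."""
    m = ORIGINAL.match(line)
    return m.groups() if m else None

for line in LINES:
    # repr() makes stray whitespace inside the groups visible.
    print(repr(line), "->", original_groups(line))
```

On the third line the optional prefix has to give up 'Remove - ' so that the mandatory separator can still find a dash, which is why 'Remove' ends up in group 1. The fourth line contains no dash, so nothing matches. In this sketch group 1 also keeps the space in front of the dash, because the greedy .+ only gives back as many characters as it must.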



Solution 1:[1]

You may use the following pattern:

^(?:Remove - )?([^-]+)(?: - ([^-]+))?$
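As a quick check of this pattern against the four sample strings, here is a minimal sketch. It assumes Python's re module (the answer does not name a flavour), and SINGLE_LINE and split_title are illustrative names only.

```python
import re

# Solution 1 pattern: each captured part may not contain a dash, and the
# " - second part" tail is optional as a whole.
SINGLE_LINE = re.compile(r"^(?:Remove - )?([^-]+)(?: - ([^-]+))?$")

def split_title(line):
    """Return (first, second); second is None when the line has no second part."""
    m = SINGLE_LINE.match(line)
    if m is None:
        return None
    return m.group(1), m.group(2)

for line in [
    "Remove - Precision Speed - Recap",
    "Precision Speed - Recap",
    "Remove - Precision Speed",
    "Precision Speed",
]:
    print(repr(line), "->", split_title(line))
```

Because the separator is written as a literal ' - ', the spaces around the dash are consumed by the separator and do not end up inside either group. In Python, an optional group that did not take part in the match is reported as None.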

If you're dealing with multiline text, add \r and \n to the negated character classes so that a match cannot span multiple lines:

^(?:Remove - )?([^-\r\n]+)(?: - ([^-\r\n]+))?$
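A sketch of how the multiline variant might be applied to a whole block of text, assuming Python's re module and lines separated by a plain "\n". In Python, ^ and $ only anchor at every line when the re.MULTILINE flag is set; other flavours have their own switch for this. MULTI_LINE, TEXT and split_all are illustrative names.

```python
import re

# Same pattern as above, compiled so that ^ and $ match at each line.
MULTI_LINE = re.compile(
    r"^(?:Remove - )?([^-\r\n]+)(?: - ([^-\r\n]+))?$",
    re.MULTILINE,
)

# Sample data from the question, joined with "\n" (an assumption about
# the line endings of the real input).
TEXT = "\n".join([
    "Remove - Precision Speed - Recap",
    "Precision Speed - Recap",
    "Remove - Precision Speed",
    "Precision Speed",
])

def split_all(text):
    """Return one (first, second) tuple per matching line."""
    return [m.groups() for m in MULTI_LINE.finditer(text)]

for groups in split_all(TEXT):
    print(groups)
```

Without \r\n in the character classes, [^-]+ would be free to run across a line break whenever a line contains no dash.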

Solution 2:[2]

Make the second - and surrounding whitespace optional.

^(?:Remove - )?([^-]+)(?:\s*-\s*)?(.*)
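A minimal sketch of this pattern in use, assuming Python's re module; OPTIONAL_DASH and split_title_v2 are illustrative names. Two details are worth handling in the calling code: [^-]+ is greedy, so group 1 can keep the space in front of the dash, and (.*) always participates in the match, so a missing second part shows up as an empty string rather than an unset group.

```python
import re

# Solution 2 pattern: the dash separator is optional, the second group
# simply takes whatever is left on the line.
OPTIONAL_DASH = re.compile(r"^(?:Remove - )?([^-]+)(?:\s*-\s*)?(.*)")

def split_title_v2(line):
    """Return (first, second) with whitespace trimmed; second is None if empty."""
    m = OPTIONAL_DASH.match(line)
    if m is None:
        return None
    first = m.group(1).strip()           # drop the space left before the dash
    second = m.group(2).strip() or None  # empty string means "no second part"
    return first, second

for line in [
    "Remove - Precision Speed - Recap",
    "Precision Speed - Recap",
    "Remove - Precision Speed",
    "Precision Speed",
]:
    print(repr(line), "->", split_title_v2(line))
```

The trimming step is an addition made for this sketch and is not part of the answer's pattern.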

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

| Solution | Source |
| -------- | -------- |
| Solution 1 | 41686d6564 stands w. Palestine |
| Solution 2 | Barmar |