Non-capturing group lazy when pattern repeats
I am trying to capture one or two groups from a line. If the line has a dash, I want one group for the text before the dash and one group for the text after it. If it does not, I would like a single group containing everything.
However, occasionally a line will start with 'Remove - ', which is a phrase I would like to ignore.
Example data:
| Strings |
| -------- |
| Remove - Precision Speed - Recap |
| Precision Speed - Recap |
| Remove - Precision Speed |
| Precision Speed |
The first two should each capture group 1: 'Precision Speed' and group 2: 'Recap', while the last two should capture only one group: 'Precision Speed'.
Right now I have:
^(?:Remove - )?(.+)(?:\s*-\s*)(.*)
It works correctly for the first two lines (I believe because they contain a second dash). For the third line it captures 'Remove' and 'Precision Speed', and for the fourth line it doesn't capture anything.
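To make the behaviour described above reproducible, here is a minimal sketch in Python's `re` module (the choice of Python and the helper name `groups_or_none` are assumptions; the question does not name a regex engine). It runs the pattern from the question against the four sample lines.

```python
import re

# Pattern from the question: the dash separator is mandatory here.
original = re.compile(r"^(?:Remove - )?(.+)(?:\s*-\s*)(.*)")

lines = [
    "Remove - Precision Speed - Recap",
    "Precision Speed - Recap",
    "Remove - Precision Speed",
    "Precision Speed",
]


def groups_or_none(pattern, text):
    """Return the captured groups as a tuple, or None when nothing matches."""
    m = pattern.match(text)
    return m.groups() if m else None


for line in lines:
    print(repr(line), "->", groups_or_none(original, line))
```

Because the separator `(?:\s*-\s*)` is required, the engine has to find a dash somewhere. On the third line it backtracks out of the optional `Remove - ` prefix and uses that dash as the separator, and on the fourth line there is no dash at all, so the match fails. Note also that the greedy `(.+)` keeps the space in front of the dash in group 1.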
Solution 1:[1]
You may use the following pattern:
^(?:Remove - )?([^-]+)(?: - ([^-]+))?$
And if you're dealing with multiline text, add \r\n to both negated character classes so that a match cannot span multiple lines:
^(?:Remove - )?([^-\r\n]+)(?: - ([^-\r\n]+))?$
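As a quick check of the two patterns above, the following sketch applies them with Python's `re` module. Python, the helper name `split_line`, and the use of `re.MULTILINE` for the multiline variant are assumptions made for illustration; the answer itself does not specify an engine or flags.

```python
import re

# Single-line variant: the second group is optional and may not participate.
single_line = re.compile(r"^(?:Remove - )?([^-]+)(?: - ([^-]+))?$")

# Multiline variant: \r\n is excluded so a match cannot run past a line end.
# re.MULTILINE (assumed here) lets ^ and $ anchor at every line boundary.
multi_line = re.compile(
    r"^(?:Remove - )?([^-\r\n]+)(?: - ([^-\r\n]+))?$", re.MULTILINE
)

lines = [
    "Remove - Precision Speed - Recap",
    "Precision Speed - Recap",
    "Remove - Precision Speed",
    "Precision Speed",
]


def split_line(text):
    """Return (group 1, group 2); group 2 is None when there is no second part."""
    m = single_line.match(text)
    if m is None:
        return None
    return m.group(1), m.group(2)


for line in lines:
    print(repr(line), "->", split_line(line))

# The same four lines as one block of text, matched line by line.
text = "\n".join(lines)
all_groups = [m.groups() for m in multi_line.finditer(text)]
print(all_groups)
```

In Python, a group that did not take part in the match is reported as `None`, which makes it easy to tell the one-group lines from the two-group lines.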
Solution 2:[2]
Make the second - and surrounding whitespace optional.
^(?:Remove - )?([^-]+)(?:\s*-\s*)?(.*)
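The pattern above can be tried out with the sketch below, again assuming Python's `re` module (the helper name `split_line` is hypothetical). Two details are worth knowing when using it: the greedy `[^-]+` keeps any space in front of the dash, and `(.*)` always takes part in the match, so lines without a second part yield an empty string rather than a missing group. The helper strips whitespace and maps the empty string to `None`; that post-processing is an addition for illustration, not part of the original answer.

```python
import re

# Pattern from Solution 2: the separator is optional, the last group is not.
pattern = re.compile(r"^(?:Remove - )?([^-]+)(?:\s*-\s*)?(.*)")

lines = [
    "Remove - Precision Speed - Recap",
    "Precision Speed - Recap",
    "Remove - Precision Speed",
    "Precision Speed",
]


def split_line(text):
    """Return (first, second) with whitespace trimmed; second is None if empty."""
    m = pattern.match(text)
    if m is None:
        return None
    first = m.group(1).strip()
    second = m.group(2).strip() or None
    return first, second


for line in lines:
    raw = pattern.match(line).groups()
    print(repr(line), "-> raw:", raw, "cleaned:", split_line(line))
```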
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | 41686d6564 stands w. Palestine |
| Solution 2 | Barmar |
