Combine negated characters and capture groups
How it works currently
I am able to capture the values between the brackets:
[[two b][three c]]
The result is
two b
three c
The RegEx for that
\[\[(.+?)\]\[(.+?)\]\]
When I use this string
[[one a]]
Nothing is captured and that is how I expect it. Fine.
The problem
I combine the strings
[[one a]] and [[two b][three c]]
This is captured
one a]] and [[two b
three c
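The three cases above can be reproduced with a short sketch. It assumes Python's `re` module, since the question does not name a regex flavor, and the variable names are only illustrative.

```python
import re

# The pattern from the question: a lazy ".+?" inside each pair of brackets.
pattern = re.compile(r"\[\[(.+?)\]\[(.+?)\]\]")

pair = "[[two b][three c]]"
single = "[[one a]]"
combined = "[[one a]] and [[two b][three c]]"

pair_result = pattern.findall(pair)
single_result = pattern.findall(single)
combined_result = pattern.findall(combined)

print(pair_result)      # [('two b', 'three c')]
print(single_result)    # []
print(combined_result)  # [('one a]] and [[two b', 'three c')]
```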
What I understand
As far as I understand, one possible approach could be to negate the ]] string. But I don't know how to do this, and I am not sure whether it is the right approach.
Solution 1:[1]
The . char matches any char other than line break chars, and quantifying it with a lazy quantifier does not stop it from matching basically any char. Matches are searched for from left to right, so the [[ that is matched is the leftmost [[, and the next ][ is matched regardless of whether there was a [ or ] in between.
So, one approach is to exclude any square brackets between [[ and ][ using a negated character class [^\]\[]:
\[\[([^\]\[]+)\]\[([^\]\[]+)\]\]
See the regex demo. Here, [^\]\[]+, which replaced .+?, matches one or more chars other than [ and ].
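As a minimal sketch of the negated character class approach, assuming Python's `re` module (the answer itself is flavor-neutral), the pattern can be checked against the strings from the question. The name `find_pairs` is hypothetical.

```python
import re

# [^\]\[]+ matches one or more chars that are neither "[" nor "]",
# so a capture can never run across a bracket.
NO_BRACKETS = re.compile(r"\[\[([^\]\[]+)\]\[([^\]\[]+)\]\]")


def find_pairs(text):
    """Return (first, second) tuples for every [[first][second]] in text."""
    return NO_BRACKETS.findall(text)


print(find_pairs("[[two b][three c]]"))                # [('two b', 'three c')]
print(find_pairs("[[one a]]"))                         # []
print(find_pairs("[[one a]] and [[two b][three c]]"))  # [('two b', 'three c')]
```

The trade-off is that the captured values themselves may not contain any square bracket.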
Another approach is the one you mention, namely, to match any char that does not start a [[ sequence (and probably a ]] sequence, too) before ][:
\[\[((?:(?!\[\[).)*?)\]\[(.*?)\]\]
\[\[((?:(?!\[\[|\][\]\[]).)*)\]\[(.*?)\]\]
See this regex demo.
The (?:(?!\[\[).)*? part matches any char (.), zero or more but as few as possible occurrences (*?), that does not start a [[ char sequence ((?!\[\[)).
The (?:(?!\[\[|\][\]\[]).)* part matches any char (.), zero or more but as many as possible occurrences (*), that does not start a [[, ]] or ][ char sequence ((?!\[\[|\][\]\[])).
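Both lookahead-based patterns can be tried side by side. This is a sketch that assumes Python's `re` module, and the names `LAZY_TEMPERED` and `GREEDY_TEMPERED` are made up for illustration.

```python
import re

# Variant 1: the first group may contain any char that does not start "[[".
LAZY_TEMPERED = re.compile(r"\[\[((?:(?!\[\[).)*?)\]\[(.*?)\]\]")

# Variant 2: the first group may contain any char that does not start
# "[[", "]]" or "][", so the quantifier can be greedy.
GREEDY_TEMPERED = re.compile(r"\[\[((?:(?!\[\[|\][\]\[]).)*)\]\[(.*?)\]\]")

combined = "[[one a]] and [[two b][three c]]"

lazy_result = LAZY_TEMPERED.findall(combined)
greedy_result = GREEDY_TEMPERED.findall(combined)

print(lazy_result)    # [('two b', 'three c')]
print(greedy_result)  # [('two b', 'three c')]

# Neither variant matches the single-bracket string on its own.
print(LAZY_TEMPERED.findall("[[one a]]"))    # []
print(GREEDY_TEMPERED.findall("[[one a]]"))  # []
```

Unlike the negated character class, these variants still allow a single [ or ] inside the first captured value, as long as it does not begin one of the excluded sequences.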
Depending on the regex flavor, you can get rid of some backslashes in this regex.
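As one illustration, and assuming Python's `re` module specifically, a ] outside a character class is treated as a literal, and a ] placed first inside a (negated) character class is literal as well. Other flavors may differ, so this is a sketch rather than a portable pattern.

```python
import re

# Fully escaped form of the negated character class pattern.
escaped = re.compile(r"\[\[([^\]\[]+)\]\[([^\]\[]+)\]\]")

# Same pattern with the backslashes Python's re does not need:
# "[^][]" is a negated class containing "]" and "[".
relaxed = re.compile(r"\[\[([^][]+)]\[([^][]+)]]")

combined = "[[one a]] and [[two b][three c]]"

escaped_result = escaped.findall(combined)
relaxed_result = relaxed.findall(combined)

print(escaped_result)  # [('two b', 'three c')]
print(relaxed_result)  # [('two b', 'three c')]
```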
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Wiktor Stribiżew |
