Regex to capture optional characters

I want to pull out a base string (Wax) from a longer string, along with some data before and after. I'm having trouble getting the last item in my list below (noWax) to match.

Can anyone flex their regex muscles? I'm fairly new to regex so advice on optimization is welcome as long as all matches below are found.

What I'm working with in Regex101:


/(?<Wax>Wax(?:Only|-?\d+))/mg

Original string               Need to extract in a capturing group
Loc3_341001_WaxOnly_S212      WaxOnly
Loc4_34412-a_Wax4_S231        Wax4
Loc3a_231121-a_Wax-4-S451     Wax-4
Loc3_34112_noWax_S311         noWax


Solution 1:[1]

Here is one way to do so, using a conditional:

(?<Wax>(no)?Wax(?(2)|(?:Only|-?\d+)))



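  • (no)?: Optional capture group. It is group 2 in flavors such as PCRE that number named groups in order, because the named group Wax is group 1.
  • (?(2)...|...): Conditional (if/else).
    • (2): Test whether capture group 2 ((no)) took part in the match. If it did, match nothing more.
    • |: Otherwise.
    • (?:Only|-?\d+): Match 'Only', or an optional '-' followed by one or more digits.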
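
As a rough sanity check, the conditional pattern above can be run against the four sample strings from the question. This is only a sketch, and it assumes Python's re module rather than the Regex101 setup in the question: Python spells the named group (?P<Wax>...) instead of (?<Wax>...), while the conditional on group 2 is written the same way. The names WAX_CONDITIONAL, SAMPLES and extract_wax are made up for the illustration.

```python
import re

# Python port of the pattern above: (?<Wax>...) becomes (?P<Wax>...).
# Group 1 is the named group Wax; group 2 is the optional (no).
WAX_CONDITIONAL = re.compile(r"(?P<Wax>(no)?Wax(?(2)|(?:Only|-?\d+)))")

# Sample strings taken from the question.
SAMPLES = [
    "Loc3_341001_WaxOnly_S212",
    "Loc4_34412-a_Wax4_S231",
    "Loc3a_231121-a_Wax-4-S451",
    "Loc3_34112_noWax_S311",
]


def extract_wax(text):
    """Return the first Wax token found in text, or None if there is none."""
    match = WAX_CONDITIONAL.search(text)
    return match.group("Wax") if match else None


if __name__ == "__main__":
    for sample in SAMPLES:
        print(sample, "->", extract_wax(sample))
```

Note that a bare 'Wax' with no 'no' prefix and no suffix is not matched by this pattern, since the else branch of the conditional requires 'Only' or digits.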

Solution 2:[2]

I assume the following match is desired.

  • the match must include 'Wax'
  • 'Wax' is to be preceded by '_' or by '_no'. In the latter case, 'no' is included in the match.
  • 'Wax' may be followed by:
    • 'Only' followed by '_', in which case 'Only' is part of the match, or
    • one or more digits, followed by '_', in which case the digits are part of the match, or
    • '-' followed by one or more digits, followed by '-', in which case '-' followed by one or more digits is part of the match.

If these assumptions are correct, the string can be matched against the following regular expression:

(?<=_)(?:(?:no)?Wax(?:(?:Only|\d+)?(?=_)|\-\d+(?=-)))


The regular expression can be broken down as follows.

(?<=_)            # positive lookbehind asserts previous character is '_'
(?:               # begin non-capture group
  (?:no)?         # optionally match 'no'
  Wax             # match literal
  (?:             # begin non-capture group
    (?:Only|\d+)? # optionally match 'Only' or >=1 digits
    (?=_)         # positive lookahead asserts next character is '_'
    |             # or
    \-\d+         # match '-' followed by >= 1 digits
    (?=-)         # positive lookahead asserts next character is '-'
  )               # end non-capture group
)                 # end non-capture group
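
The commented breakdown above maps fairly directly onto a verbose-mode pattern. The following is a minimal sketch, assuming Python's re module with the re.VERBOSE flag (the answer itself does not name a language); WAX_LOOKAROUND and find_wax_tokens are hypothetical names chosen for the example.

```python
import re

# Same expression as above, written in verbose mode so that whitespace
# and '#' comments inside the pattern are ignored by the regex engine.
WAX_LOOKAROUND = re.compile(
    r"""
    (?<=_)            # positive lookbehind: previous character is '_'
    (?:               # begin non-capture group
      (?:no)?         # optionally match 'no'
      Wax             # match literal
      (?:             # begin non-capture group
        (?:Only|\d+)? # optionally match 'Only' or >= 1 digits
        (?=_)         # positive lookahead: next character is '_'
        |             # or
        \-\d+         # match '-' followed by >= 1 digits
        (?=-)         # positive lookahead: next character is '-'
      )               # end non-capture group
    )                 # end non-capture group
    """,
    re.VERBOSE,
)


def find_wax_tokens(text):
    """Return every Wax token in text, in order of appearance."""
    # The pattern has no capturing groups, so findall returns whole matches.
    return WAX_LOOKAROUND.findall(text)


if __name__ == "__main__":
    lines = "\n".join([
        "Loc3_341001_WaxOnly_S212",
        "Loc4_34412-a_Wax4_S231",
        "Loc3a_231121-a_Wax-4-S451",
        "Loc3_34112_noWax_S311",
    ])
    print(find_wax_tokens(lines))
```

Because the lookarounds only assert and do not consume, the surrounding '_' and '-' characters stay out of the returned tokens.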

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution      Source
Solution 1
Solution 2    Cary Swoveland