Null value and whitespace in Regex

I have the following Regex:

^(?P<port_name>[\w\d\/]+)[^\S\r\n]+(?P<description>[\S]+ ?[\S]+)\s+(?P<duplex>[\w]+)\s+

That I use on the following data:

>show interfaces status
Port      Description     Duplex Speed   Neg  Link   Flow  M  VLAN
                                              State  Ctrl
--------- --------------- ------ ------- ---- ------ ----- -- -------------------
Te1/0/1   CVH10 Mgt+Clstr Full   10000   Off  Up     On    T  (1),161-163
Te1/0/2   CVH10 VM 1      Full   10000   Off  Up     On    T  (1),11,101,110,
                                                              120,130,140,150,
                                                              160,170,180,190,
                                                              200,210,230,240,
                                                              250,666,999
Fo2/1/1                   N/A    N/A     N/A  Detach N/A
Te2/0/8                   Full   10000   Off  Down   Off   A  1

Which gives me this result (https://regex101.com):

[Screenshot of the regex101 match result]

I can't seem to figure out:

  • How to match the description, which can contain spaces, without also capturing the duplex value (Full).
  • How to match a null value when no description is available, as in the last two lines.

I'm starting to learn Regex and hope someone can help me out.



Solution 1:[1]

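You cannot assume that the columns are separated by at least 2 whitespace chars (in the Te1/0/1 row, a single space separates the description from Full), so you can make the pattern a bit more specific by matching the known duplex values.

Note that \w also matches \d, so [\w\d\/] can be shortened to [\w\/].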

^(?P<port_name>[\w\/]+)[^\S\r\n]+(?P<description>(?!Full\b|N\/A\b)\S+(?:[^\S\r\n]+\S+)*?[^\S\r\n]+)?(?P<duplex>Full|N\/A)[^\S\r\n]+(?P<speed>\d+|N\/A)\b
  • ^ Start of string (or the start of each line when the multiline flag is enabled)
  • (?P<port_name>[\w\/]+)[^\S\r\n]+ port_name Match 1+ word chars or /, followed by 1+ spaces
  • (?P<description> description
    • (?!Full\b|N\/A\b) Negative lookahead, assert not Full or N/A directly to the right
    • \S+(?:[^\S\r\n]+\S+)*? Match 1+ non whitespace chars and optionally repeat 1+ spaces and 1+ non whitespace chars, as few times as possible
    • [^\S\r\n]+ Match 1+ spaces (whitespace chars without a newline)
  • )? Close the group and make it optional
  • (?P<duplex>Full|N\/A)[^\S\r\n]+ duplex Match either Full or N/A
  • (?P<speed>\d+|N\/A)\b speed Match either 1+ digits or N/A

See a regex 101 demo.
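
As a rough sketch of how the pattern above could be used from Python (the question does not say which language runs the regex, so Python is an assumption, and the names SAMPLE, PATTERN and parse_ports are made up for this example). The multiline flag is needed so that ^ matches at the start of every line. Note that the description group includes its trailing spaces, so the sketch strips them, and that an optional group which did not take part in the match comes back as None:

```python
import re

# Sample data copied from the question (header rows included on purpose).
SAMPLE = """\
Port      Description     Duplex Speed   Neg  Link   Flow  M  VLAN
                                              State  Ctrl
--------- --------------- ------ ------- ---- ------ ----- -- -------------------
Te1/0/1   CVH10 Mgt+Clstr Full   10000   Off  Up     On    T  (1),161-163
Te1/0/2   CVH10 VM 1      Full   10000   Off  Up     On    T  (1),11,101,110,
                                                              120,130,140,150,
Fo2/1/1                   N/A    N/A     N/A  Detach N/A
Te2/0/8                   Full   10000   Off  Down   Off   A  1
"""

# The pattern from the answer, split over several lines for readability.
PATTERN = re.compile(
    r"^(?P<port_name>[\w\/]+)[^\S\r\n]+"
    r"(?P<description>(?!Full\b|N\/A\b)\S+(?:[^\S\r\n]+\S+)*?[^\S\r\n]+)?"
    r"(?P<duplex>Full|N\/A)[^\S\r\n]+"
    r"(?P<speed>\d+|N\/A)\b",
    re.MULTILINE,  # let ^ match at the start of each line
)


def parse_ports(text):
    """Return one dict per matching row (hypothetical helper)."""
    rows = []
    for m in PATTERN.finditer(text):
        row = m.groupdict()
        # The description group captures its trailing spaces, so strip them.
        # If the optional group did not match, the value is None.
        if row["description"] is not None:
            row["description"] = row["description"].strip()
        rows.append(row)
    return rows


rows = parse_ports(SAMPLE)
for row in rows:
    print(row)
```

The header, separator and wrapped VLAN lines do not match, because none of them has Full or N/A in the duplex position.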

To match all values including optional fields:

^(?P<port_name>[\w\/]+)[^\S\r\n]+(?P<description>(?!Full\b|N\/A\b)\S+(?:[^\S\r\n]+\S+)*?)?\s+(?P<duplex>Full|N\/A)\b\s+(?P<speed>[\d\w\/]+)\s+(?P<neg>[\w\/]+)\s+(?P<link_state>[\w]+)\s+(?P<flow_control>[\w\/]+)(?:(?P<mode>[^\S\r\n]+\w+)(?:[^\S\r\n]+(?P<vlans>[\d(),-]+(?:\r?\n[^\S\r\n]+[\d(),-]+)*))?)?

See another regex 101 demo.
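
A similar sketch for the full pattern, again assuming Python and using made-up names (SAMPLE, FULL_PATTERN, parse_interfaces). Two details of the pattern as written need some post-processing: the mode group captures its leading spaces, and a wrapped VLAN list is captured together with its line breaks and indentation. The sample has no trailing spaces at the end of the lines; the wrapped VLAN lines would not be picked up if it did.

```python
import re

# Data rows copied from the question.
SAMPLE = """\
Port      Description     Duplex Speed   Neg  Link   Flow  M  VLAN
                                              State  Ctrl
--------- --------------- ------ ------- ---- ------ ----- -- -------------------
Te1/0/1   CVH10 Mgt+Clstr Full   10000   Off  Up     On    T  (1),161-163
Te1/0/2   CVH10 VM 1      Full   10000   Off  Up     On    T  (1),11,101,110,
                                                              120,130,140,150,
                                                              160,170,180,190,
                                                              200,210,230,240,
                                                              250,666,999
Fo2/1/1                   N/A    N/A     N/A  Detach N/A
Te2/0/8                   Full   10000   Off  Down   Off   A  1
"""

# The full pattern from the answer, split over several lines for readability.
FULL_PATTERN = re.compile(
    r"^(?P<port_name>[\w\/]+)[^\S\r\n]+"
    r"(?P<description>(?!Full\b|N\/A\b)\S+(?:[^\S\r\n]+\S+)*?)?"
    r"\s+(?P<duplex>Full|N\/A)\b"
    r"\s+(?P<speed>[\d\w\/]+)"
    r"\s+(?P<neg>[\w\/]+)"
    r"\s+(?P<link_state>[\w]+)"
    r"\s+(?P<flow_control>[\w\/]+)"
    r"(?:(?P<mode>[^\S\r\n]+\w+)"
    r"(?:[^\S\r\n]+(?P<vlans>[\d(),-]+(?:\r?\n[^\S\r\n]+[\d(),-]+)*))?)?",
    re.MULTILINE,  # let ^ match at the start of each line
)


def parse_interfaces(text):
    """Return one dict per interface row (hypothetical helper)."""
    rows = []
    for m in FULL_PATTERN.finditer(text):
        row = m.groupdict()
        if row["mode"] is not None:
            # The mode group includes the spaces in front of the value.
            row["mode"] = row["mode"].strip()
        if row["vlans"] is not None:
            # Join a wrapped VLAN list by dropping line breaks and indentation.
            row["vlans"] = re.sub(r"\s+", "", row["vlans"])
        rows.append(row)
    return rows


interfaces = parse_interfaces(SAMPLE)
for row in interfaces:
    print(row["port_name"], row["description"], row["mode"], row["vlans"])
```

Fields that are missing on a row (description, mode, vlans) are returned as None rather than as an empty string.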

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution 1: The fourth bird