How to not match a substring in any location of the main string


This might seem to be a repetitive question here but I have tried all other SO posts and the suggestions are not working for me.
Basically, I want to exclude strings that have a particular substring in them, either at the beginning, middle or at the end.

Here is an example,
Max_Num_HR, HR_Max_Num, Max_HR_Num
I want to exclude the strings that contain either _HR (at the end), HR_ (at the beginning), or _HR_ (in between).

What I have tried so far:
r"(^((?!HR_).*))(?<!_HR)$"
This successfully excludes strings that start with HR_ or end with _HR, but not strings that have _HR_ in the middle.
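To make that behavior concrete, here is a minimal sketch that runs the pattern above against the three example strings. It assumes Python's re module (the r"..." prefix suggests Python, but the question does not say so explicitly), and the names attempted and results are only illustrative.

```python
import re

# The pattern from the question, unchanged.
attempted = re.compile(r"(^((?!HR_).*))(?<!_HR)$")

# True means the string is matched (kept), False means it is excluded.
results = {
    name: attempted.search(name) is not None
    for name in ["Max_Num_HR", "HR_Max_Num", "Max_HR_Num"]
}

for name, matched in results.items():
    print(name, matched)
```

Only Max_HR_Num is still matched, because nothing in the pattern checks for _HR_ in the middle of the string.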

I have looked at "How to exclude a string in the middle of a RegEx string?", but the solution there did not work for me.

I understand that the first segment of my pattern, (^((?!HR_).*)), excludes every string that starts with HR_, since the ^ at the beginning is followed by a negative lookahead. The second segment, (?<!_HR)$, applies at the end of the string and uses a negative lookbehind to check that the string does not end with _HR. Following this train of thought, I tried adding (?!_HR_) between the two segments, but to no avail.

So, how do I get it to exclude all three HR_, _HR_, _HR considering Max_Num_HR, HR_Max_Num, Max_HR_Num as the test case?



Solution 1:[1]

The pattern is missing the assertion for _HR_ somewhere in the string.

You can place the negative lookbehind after the dollar sign, as in $(?<!_HR), to assert that the string does not end with _HR. This can prevent some backtracking over the .+

Note that if you only need a match, the capture groups are unnecessary.

^(?!HR_)(?!.*_HR_).+$(?<!_HR)
  • ^ Start of string
  • (?!HR_) Assert not HR_ at the start
  • (?!.*_HR_) Assert not _HR_ in the string
  • .+$ Match 1 or more characters (so that an empty string does not match), then assert the end of the string
  • (?<!_HR) Assert not _HR directly to the left, i.e. the string does not end with _HR
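As a quick check of the breakdown above, the following sketch applies the pattern to the three test strings from the question. It assumes Python's re module; the helper name is_allowed and the extra inputs "Max_Num" and "" are hypothetical additions for illustration, not part of the original question.

```python
import re

# Pattern from the answer: reject HR_ at the start, _HR_ anywhere, _HR at the end.
pattern = re.compile(r"^(?!HR_)(?!.*_HR_).+$(?<!_HR)")


def is_allowed(name: str) -> bool:
    """Return True if name is non-empty and has no HR_ prefix, _HR_ infix, or _HR suffix."""
    return pattern.search(name) is not None


# The three strings from the question, plus two illustrative extras.
for name in ["Max_Num_HR", "HR_Max_Num", "Max_HR_Num", "Max_Num", ""]:
    print(repr(name), is_allowed(name))
```

All three strings from the question are rejected, while a string without any of the HR forms, such as Max_Num, is still accepted. The empty string is rejected because .+ requires at least one character.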


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 The fourth bird