RegExp: match all text parts except given words
I have a text and I need to match all parts of it except the given words, using a regexp.
For example, if the text is ' Something went wrong and I could not do anything ' and the given words are 'and' and 'not', then the result must be ['Something went wrong', 'I could', 'do anything'].
Please don't advise me to use string.split(), string.replace() and so on. I know several ways to do this with built-in methods. I'm wondering whether there is a regex that can do this when I execute text.match(/regexp/g).
Please note that the regular expression must work at least in Chrome, Firefox and Safari, in versions no more than 3 releases behind the current one! At the time of asking this question the current versions are 100.0, 98.0.2 and 15.3 respectively. For example, you cannot use lookbehind, which these Safari versions do not support.
Please, before answering my question, go to https://regexr.com/ and check your answer! Your regular expression should highlight all the needed parts of the sentence, including the spaces between words inside those parts, but excluding the whitespace around those parts and the given words themselves.
Before asking this question I did my own search, but these links didn't help me. I also tried the non-accepted answers:
Match everything except for specified strings
Regex: match everything but a specific pattern
Regex to match all words except a given list
Regex to match all words except a given list (2)
Need to find a regular expression for any word except word1 or word2
Solution 1:[1]
See Edit further down.
You can use this regex, which only uses lookahead:
/(?!and|not)\b.*?(?=and|not|$)/g
Explanation:
(?!and|not) - negative lookahead for and or not
\b - match a word boundary, to prevent matching nd and ot
.*? - match any char zero or more times, as few as possible
(?=and|not|$) - lookahead for and, or not, or the end of the text
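To see what this first pattern returns, it can be run on the sample sentence from the question. This is only a minimal sketch, and the variable names are illustrative. Note that the matches still carry the spaces next to the excluded words:

```javascript
// First pattern from this answer, applied to the sample text from the question.
const text = ' Something went wrong and I could not do anything ';
const firstPattern = /(?!and|not)\b.*?(?=and|not|$)/g;

const parts = text.match(firstPattern);
console.log(parts);
// → [ 'Something went wrong ', ' I could ', ' do anything ' ]
```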
If your text has multiple lines, you can add the m flag (multiline), so that $ also matches at the end of each line. Alternatively, you can replace the dot (.) with [\s\S], which also matches line breaks.
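The two options are not equivalent: the m flag changes where $ can match, while [\s\S] lets a single match run across line breaks. The sketch below uses deliberately simple patterns (not the pattern from this answer) to isolate each effect; the sample string and variable names are assumptions made for illustration:

```javascript
const twoLines = 'first\nsecond';

// With only the g flag, $ matches at the very end of the text and . stops at a newline.
const lastLineOnly = twoLines.match(/.+$/g);
console.log(lastLineOnly); // → [ 'second' ]

// With m, $ also matches before each line break, so every line yields a match.
const perLine = twoLines.match(/.+$/gm);
console.log(perLine); // → [ 'first', 'second' ]

// [\s\S] matches any character, newlines included, so one match spans both lines.
const acrossLines = twoLines.match(/[\s\S]+$/g);
console.log(acrossLines); // → [ 'first\nsecond' ]
```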
Edit:
I have changed it a little so spaces around the forbidden words are removed:
/(?!and|not)\b\w.*?(?= and| not|$)/g
I have added a \w character match to push the start of the match past the space, and added spaces in the lookahead.
Edit2 (to handle multiple spaces around words):
You were very close! All you need is a \s* before the dollar sign and the specified words:
/(?!and|not|\s)\b.*?(?=\s*(and|not|$))/g
Updated link: regexr.com
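The difference between the pattern from the first edit and the one from Edit2 shows up at the end of the sample text. A short comparison sketch, using the sample sentence from the question (the variable names are illustrative):

```javascript
const sample = ' Something went wrong and I could not do anything ';

// Pattern from the first edit: spaces next to the excluded words are dropped,
// but the space before the end of the text is still captured.
const editPattern = /(?!and|not)\b\w.*?(?= and| not|$)/g;
const editResult = sample.match(editPattern);
console.log(editResult);
// → [ 'Something went wrong', 'I could', 'do anything ' ]

// Pattern from Edit2: \s* in the lookahead leaves trailing whitespace out,
// and \s in the negative lookahead keeps a match from starting on whitespace.
const edit2Pattern = /(?!and|not|\s)\b.*?(?=\s*(and|not|$))/g;
const edit2Result = sample.match(edit2Pattern);
console.log(edit2Result);
// → [ 'Something went wrong', 'I could', 'do anything' ]
```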
Solution 2:[2]
It's possible using only match and lookaheads in JavaScript.
/\b(?=\w)(?!(?:and|not)\b).*?(?=\s+(?:and|not)\b|\s*$)/gi
Test on RegExr here
Basically, match the start of a word that's not a restricted word: \b(?=\w)(?!(?:and|not)\b)
Then a lazy match up to the next whitespace followed by a restricted word, or up to the end of the line without including the trailing whitespace: .*?(?=\s+(?:and|not)\b|\s*$)
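The \b after each (?:and|not) group means only whole words count as restricted. A small sketch with a made-up sentence (an assumption for illustration, not taken from the question) in which "nothing" and "sandy" merely contain the restricted words:

```javascript
const pattern = /\b(?=\w)(?!(?:and|not)\b).*?(?=\s+(?:and|not)\b|\s*$)/gi;

// "nothing" starts with "not" and "sandy" contains "and",
// but neither is followed by a word boundary at that point, so both are kept.
const tricky = 'nothing broke and sandy left';
const kept = tricky.match(pattern);
console.log(kept);
// → [ 'nothing broke', 'sandy left' ]
```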
Test Snippet :
const re = /\b(?=\w)(?!(?:and|not)\b).*?(?=\s+(?:and|not)\b|\s*$)/gi
let str = ` Something went wrong and I could not do anything `;
let arr = str.match(re);
console.log(arr);
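If the list of words is not known in advance, the same pattern could be assembled at run time. The following is a sketch under assumptions: buildExceptRegex and escapeRegExp are hypothetical helper names, not part of the answer, and replace() is used here only to escape the words while building the pattern, not to do the matching.

```javascript
// Hypothetical helper: escape regex metacharacters so a word is matched literally.
function escapeRegExp(word) {
  return word.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build the Solution 2 pattern for an arbitrary list of words.
function buildExceptRegex(words) {
  const alternation = words.map(escapeRegExp).join('|');
  return new RegExp(
    `\\b(?=\\w)(?!(?:${alternation})\\b).*?(?=\\s+(?:${alternation})\\b|\\s*$)`,
    'gi'
  );
}

const input = ' Something went wrong and I could not do anything ';
const result = input.match(buildExceptRegex(['and', 'not']));
console.log(result);
// → [ 'Something went wrong', 'I could', 'do anything' ]
```

With a different list, for example buildExceptRegex(['or']) applied to 'tea or coffee', the same construction yields the parts 'tea' and 'coffee'.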
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | EzioMercer |
Solution 2 |