(*SKIP)(*FAIL) workaround in JavaScript RegExp
I have a regex pattern that works fine in regex101.com: ~<a .*?">(*SKIP)(*FAIL)|\bword\b
I am trying to make it a Regexp so it can be used in the replace() function in JavaScript.
The line of JavaScript code is:
var regex = new RegExp("~<a.*?\">(*SKIP)(*FAIL)|\\b"+ word + "\\b", 'g');
Where word is the word I'm trying to match.
When I run it though, the console shows the following error:
Uncaught (in promise) SyntaxError: Invalid regular expression:
/~<a.*?">(*SKIP)(*FAIL)|word/: Nothing to repeat
Am I escaping characters wrong?
I tried backslash-escaping every special character I could find (?, *, < and so on) in my JavaScript code and it still spat out that error.
Solution 1:[1]
You can work around the missing (*SKIP)(*FAIL) support in JavaScript using capturing groups in the pattern and a bit of code logic.
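The error from the question can be reproduced in isolation. As far as I can tell, it is not an escaping problem in the usual sense: JavaScript has no backtracking control verbs, so `(*` is parsed as a group that starts with a quantifier, which leaves the `*` with nothing to repeat. A minimal sketch, using the literal `word` in place of the variable (the exact message text may differ between engines):

```javascript
// Try to compile the PCRE-style pattern with the JavaScript engine.
let message = null;
try {
  new RegExp(String.raw`<a .*?">(*SKIP)(*FAIL)|\bword\b`, 'g');
} catch (e) {
  // A SyntaxError is expected here; the wording depends on the engine.
  message = e instanceof SyntaxError ? e.message : String(e);
}
console.log(message);
```

Escaping the `(` and `*` would make the pattern compile, but it would then match the literal text `(*SKIP)(*FAIL)` rather than act as verbs, which is why a different approach is needed.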
Note that the (*SKIP)(*FAIL) verb sequence is explained in my YouTube video called "Skipping matches in specific contexts (with SKIP & FAIL verbs)". You can also find a demo of JavaScript lookarounds for four different scenarios: extracting, replacing, removing and splitting.
Let's adjust the code for the current question. Let's assume word always consists of word characters (digits, letters or underscores).
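If `word` can contain regex metacharacters, that assumption no longer holds and the word should be escaped before it is interpolated into the pattern. A minimal sketch, assuming a hypothetical helper named `escapeRegExp` (it is not part of the original answer and not a built-in used here):

```javascript
// Hypothetical helper: backslash-escape every regex metacharacter in a string.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const text = `a.b axb <a href="a.b">a.b</a>`;
const word = 'a.b';
// Without escaping, the dot in `a.b` would also match `axb`.
const regex = new RegExp(String.raw`(<a .*?">)|\b${escapeRegExp(word)}\b`, 'gi');
const result = text.replace(regex, (match, group1) => group1 || 'buz');
console.log(result); // => buz axb <a href="a.b">buz</a>
```

Keep in mind that `\b` only works as expected when the word starts and ends with a word character; a word such as `c++` would need different boundary checks.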
- Extracting: Capture the word into Group 1 and only extract Group 1 values:
const text = `foo <a href="foo.com">foo</a> foobar`;
const word = 'foo';
const regex = new RegExp(String.raw`<a .*?">|\b(${word})\b`, 'gi');
console.log(Array.from(text.matchAll(regex), x=>x[1]).filter(Boolean)); // => ["foo", "foo"] (the 1st word and the one in `>foo<`)
- Removing: Capture the context you need to keep into Group 1 and replace with a backreference to this group:
const text = `foo <a href="foo.com">foo</a> foobar`;
const word = 'foo';
const regex = new RegExp(String.raw`(<a .*?">)|\b${word}\b`, 'gi');
console.log(text.replace(regex, '$1')); // => " <a href=\"foo.com\"></a> foobar" (note the leading space)
- Replacing: Capture the context you need to keep into Group 1 and pass a callback function/arrow function as the replacement argument. When Group 1 matched, return its value; otherwise, return the replacement you need:
const text = `foo <a href="foo.com">foo</a> foobar`;
const word = 'foo';
const regex = new RegExp(String.raw`(<a .*?">)|\b${word}\b`, 'gi');
console.log(text.replace(regex, (match, group1) => group1 || 'buz' ));
// => buz <a href="foo.com">buz</a> foobar
- Splitting: This is the most intricate scenario and it requires a bit more coding:
const text = `foo <a href="foo.com">foo</a> foobar`;
const word = 'foo';
const regex = new RegExp(String.raw`(<a .*?">)|\b${word}\b`, 'gi');
let m, res = [], offset = 0;
while ((m = regex.exec(text)) !== null) { // For each match...
  if (m[1] === undefined) { // ...where Group 1 did not match (a real delimiter)
    res.push(text.substring(offset, m.index)); // add the chunk before the match to the result array
    offset = m.index + m[0].length; // and set the new chunk start position
  }
}
if (offset < text.length) { // If there is any text left after offset,
  res.push(text.substring(offset)); // add it to the result array
}
console.log(res);
// => ["", " <a href=\"foo.com\">", "</a> foobar"]
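The replacing and splitting scenarios above can be wrapped into small reusable functions. This is only a sketch: the names `buildRegex`, `replaceOutsideTags` and `splitOutsideTags` are hypothetical, and `word` is still assumed to consist of word characters only. Unlike the loop above, `splitOutsideTags` always pushes the trailing chunk, even when it is empty, which mirrors how `String.prototype.split` behaves.

```javascript
// Hypothetical helpers (illustrative names, not from the original answer).
// Assumes `word` contains only word characters.
function buildRegex(word) {
  // Group 1 captures the context to skip; the other alternative is the real target.
  return new RegExp(String.raw`(<a .*?">)|\b${word}\b`, 'gi');
}

function replaceOutsideTags(text, word, replacement) {
  // Put the skipped context back unchanged, replace everything else.
  return text.replace(buildRegex(word), (match, skipped) => skipped || replacement);
}

function splitOutsideTags(text, word) {
  const parts = [];
  let offset = 0;
  for (const m of text.matchAll(buildRegex(word))) {
    if (m[1] !== undefined) continue;        // skipped context, not a delimiter
    parts.push(text.slice(offset, m.index)); // chunk before the delimiter
    offset = m.index + m[0].length;          // next chunk starts after it
  }
  parts.push(text.slice(offset));            // trailing chunk (may be empty)
  return parts;
}

const sample = `foo <a href="foo.com">foo</a> foobar`;
const replaced = replaceOutsideTags(sample, 'foo', 'buz');
const pieces = splitOutsideTags(sample, 'foo');
console.log(replaced); // => buz <a href="foo.com">buz</a> foobar
console.log(pieces);   // => ["", " <a href=\"foo.com\">", "</a> foobar"]
```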
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Wiktor Stribiżew |
