'Exclude some characters in string
In quotes I try to exclude the ; characters using regex in Node.js.
For example i have an sql like below:
update codes set character='.' where character = ';' and write_date < now()-5 ;
I want to find effected rows before execute the statement. I wrote below regex but it not work correctly if there is ; character in quotes as above
const regexp = /update\s+((.|\n)[^;]*?)\s+set\s+((.|\n)[^;]*?)\s+where\s+((.|\n)[^;]*?);/ig;
regexp.exec(str)
Expected output:
table: codes
where: character = ';' and write_date < now()-5
But I get:
table: codes
where: character = ';
Solution 1:[1]
You can use
update\s+([^;]*?)\s+set\s(?:[^;]*?\s)?where\s+((?:'[^']*'|[^;])*?)\s*;
See the regex demo. Details:
update- a word\s+- one or more whitespaces([^;]*?)- Group 1: zero or more but as few as possible chars other than;\s+- one or more whitespacesset- a word\s- a whitespace(?:[^;]*?\s)?- an optional sequence of any chars other than;as few as possible, and then a whitespacewhere- a word\s+- one or more whitespaces((?:'[^']*'|[^;])*?)- Group 2: zero or more (as few as possible) sequences of', zero or more non-'s, and then', or any single char other than a;\s*- zero or more whitespaces;- a;char.
Solution 2:[2]
First of, I'm not sure what (.|\n) is for, so I'm ignoring that.
I believe there are two problems with your regexp, changing either will probably solve your problem, but I'd change both, just to be sure.
The
?after the*makes the*non-greedy, which means the regex will match as little as possible, so that the final;in the regexp will match the first possible;it finds, not the last possible. So I'd leave the?out.The regexp doesn't use
$to anchor to the end of string. Add$after;at the end (possibly\s*$if you expect additional white space at the end of the string). If you do this, you actually don't need to exclude;. And it may be a good idea, to add^(or^\s*) at the beginning to anchor to the beginning of the string, too.
So the resulting regexp is
const regexp = /^\s*update\s+((.|\n).*)\s+set\s+((.|\n).*)\s+where\s+((.|\n).*);\s*$/ig;
Finally some conceptional ideas: Why are you doing this in the first place? Instead of starting with the UPDATE SQL, why don't you start out with the structure:
{
table: "codes",
where: "character = ';' and write_date < now()-5"
}
and build both the UPDATE and the SELECT SQLs from that?
Or if you only have the UPDATE SQL, instead of using a regular expression, there are SQL parser libraries (example) which would probably be more reliable.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Wiktor Stribiżew |
| Solution 2 | RoToRa |
