regex to ignore a match with a special character in re.escape
I am trying to update a description field by replacing certain keywords with a hyperlinked version of said keyword. My code works; however, when the program runs again, it will find the keyword even when it is hyperlinked and try to hyperlink it again.
For example, I have a keyword "apple" that I want to replace with "[apple|https://example.com]". The code below replaces all keywords with the hyperlink, but I want to ignore any matches that would produce a duplicate hyperlink. My idea was to ignore matches that have the character '|' immediately after them OR the character '[' immediately before them. There is most likely an easier solution for this though.
import re
desc = "This is a very long apple description"
message = re.compile(re.escape('apple'), re.IGNORECASE)
newMsg = message.sub('[apple|https://example.com]', desc)
This is the code I have, which works to replace the matches with a hyperlink, but I can't figure out how to ignore the matches that have already been hyperlinked.
I've tried changing the code to:
message = re.compile(re.escape(r'apple(?!\|)'), re.IGNORECASE)
but that does not seem to work.
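A likely reason the attempt above fails: re.escape escapes every regex metacharacter in the string it is given, so the lookahead `(?!\|)` becomes literal text instead of being interpreted as a lookahead. A minimal sketch of the difference, assuming only the keyword should be escaped and the lookahead appended afterwards (the variable names are illustrative):

```python
import re

# re.escape treats the whole string as literal text, so the lookahead is lost:
# this pattern only matches the literal characters "apple(?!\|)".
broken = re.compile(re.escape(r'apple(?!\|)'), re.IGNORECASE)

# Escape only the keyword, then append the lookahead as real regex syntax.
fixed = re.compile(re.escape('apple') + r'(?!\|)', re.IGNORECASE)

desc = "This is a very long apple description"
broken_matches = broken.findall(desc)
fixed_matches = fixed.findall(desc)
linked_matches = fixed.findall("[apple|https://example.com]")

print(broken_matches, fixed_matches, linked_matches)
```

With this split, the keyword can still contain special characters safely, while the lookahead keeps its meaning.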
Solution 1:[1]
I was able to get it to work by replacing my code with this:
message = re.compile(r'\bapple(?!\|)', flags=re.IGNORECASE)
desc = message.sub('[apple|https://example.com]', desc)
If there is a better solution, please let me know.
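One possible generalization, offered as a sketch rather than a definitive answer: combine the lookahead for '|' with a lookbehind for '[', as the question suggested, and escape only the keyword. The helper name `link_keyword` and its parameters are hypothetical, and the `\b` anchors assume the keyword starts and ends with a word character.

```python
import re

def link_keyword(text, keyword, url):
    """Wrap each unlinked occurrence of keyword as [keyword|url] (hypothetical helper)."""
    pattern = re.compile(
        r'(?<!\[)\b' + re.escape(keyword) + r'\b(?!\|)',  # skip "[keyword" and "keyword|"
        re.IGNORECASE,
    )
    # A function replacement keeps the matched text's original case and avoids
    # backslash processing of the replacement string.
    return pattern.sub(lambda m: '[' + m.group(0) + '|' + url + ']', text)

desc = "This is a very long apple description"
once = link_keyword(desc, 'apple', 'https://example.com')
twice = link_keyword(once, 'apple', 'https://example.com')  # second run changes nothing
print(once)
print(twice == once)
```

Running the helper repeatedly on its own output leaves the text unchanged for this example. It would still re-link a keyword that appears inside the URL itself (for instance a URL containing "apple"), so that case would need extra handling.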
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | pastacosta |
