How to use non-ASCII characters in the regex filter of the declarativeNetRequest API for a Google Chrome extension?
I am creating a Chrome extension and just found out that it's not possible to use non-ASCII characters in the regex filter of the declarativeNetRequest API. I need them because I want to match and block all requests to domains with the TLDs рф and дети. Is there any solution for this?
I currently use a block action in a ruleset specified with the following condition and this works fine:
"condition": {"regexFilter": "^(.*):\/\/(.*).(ru|su|tatar)\/(.*)", "resourceTypes": ["main_frame"] }
However, I still want to add the TLDs рф and дети. I tried using a Unicode-encoded representation of them in the regex too, but that didn't work either. So, how can I use non-ASCII characters in the regex filter of the declarativeNetRequest API for a Google Chrome extension? Or is there an alternative approach to achieve this?
Side note: I feel like this question belongs more on the "Web Applications" Stack Exchange. However, considering that the tag google-chrome-extensions has 27,000+ questions on Stack Overflow and fewer than 100 questions on the WebApps Stack Exchange, I think it's more effective to post it here.
Solution 1:[1]
For the sake of completeness: as user wOxxOm mentioned in the comments on the question, Punycode is the solution for internationalized domains.
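As a rough illustration of what that conversion looks like, here is a minimal sketch that gets the Punycode form of a TLD with the standard `URL` parser, which is available in browsers and in Node.js. The helper name `tldToPunycode` and the `example` host label are my own placeholders, not part of any Chrome API.

```javascript
// Hypothetical helper: convert a Unicode TLD to its ASCII (Punycode) form.
// The URL parser applies IDNA encoding to the host while parsing.
function tldToPunycode(tld) {
  // "example" is only a placeholder label so that the URL has a full host name.
  const host = new URL(`http://example.${tld}/`).hostname;
  // The last dot-separated label of the host is the encoded TLD.
  return host.split(".").pop();
}

console.log(tldToPunycode("рф"));   // xn--p1ai
console.log(tldToPunycode("дети")); // xn--d1acj3b
```

Plain ASCII TLDs such as `ru` pass through unchanged, so the same helper can be run over a mixed list.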
So, in order to match the TLDs рф and дети, I used this regex instead:
^(.*):\/\/(.*).(ru|su|tatar|xn--p1ai|xn--d1acj3b)\/(.*)
I converted the TLDs to Punycode using https://www.punycoder.com/.
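To sanity-check a pattern like this outside the extension, the sketch below puts it into a rule object and tests it against a few sample URLs. Several things here are assumptions on my part: the `id` and `priority` values are placeholders, the helper `wouldMatch` is hypothetical, the dot before the TLD group is escaped so that it only matches a literal dot (a small tightening of the pattern above), and JavaScript's `RegExp` is only an approximation of the RE2 syntax that declarativeNetRequest uses. The sample URL is normalized with `new URL(...)` first, on the assumption that the rule is matched against the Punycode form of the host.

```javascript
// Hypothetical block rule; id and priority are placeholder values.
const rule = {
  id: 1,
  priority: 1,
  action: { type: "block" },
  condition: {
    // Assumption: "\\." (escaped dot) replaces the bare "." of the original pattern.
    regexFilter: "^(.*)://(.*)\\.(ru|su|tatar|xn--p1ai|xn--d1acj3b)/(.*)",
    resourceTypes: ["main_frame"],
  },
};

// Hypothetical helper: approximate the rule's regex matching with a JS RegExp.
function wouldMatch(rawUrl) {
  // URL parsing converts an internationalized host to its Punycode form.
  const normalized = new URL(rawUrl).href;
  return new RegExp(rule.condition.regexFilter).test(normalized);
}

console.log(wouldMatch("https://пример.рф/"));      // true
console.log(wouldMatch("https://example.ru/page")); // true
console.log(wouldMatch("https://example.com/"));    // false
```

This only checks the regex itself; whether Chrome accepts and applies the rule still has to be verified in the extension.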
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Bob Ortiz |
