Convert deprecated @include with regular expression to @match in userscript

Tampermonkey shows a deprecation warning for the @include directives in my user scripts:

// @include /https\:\/\/([a-z\.]*\.)?(((stackexchange|askubuntu|superuser|serverfault|stackoverflow|stackapps)\.com)|(mathoverflow\.net))\/.*/
// @exclude /^https://(chat|api|data)\./
// @exclude https://stackexchange.com/*

eslint: userscripts/better-use-match - Using @include is potentially unsafe and may be obsolete in Manifest v3 in early 2023. Please switch to @match.

The documentation for @match says:

More or less equal to the @include tag. You can get more information here. Note: the <all_urls> statement is not yet supported and the scheme part also accepts http*://.

Multiple tag instances are allowed.

However, despite what this less-than-helpful documentation says, the two are not equivalent at all. This doesn't work:

// @match /https\:\/\/([a-z\.]*\.)?(((stackexchange|askubuntu|superuser|serverfault|stackoverflow|stackapps)\.com)|(mathoverflow\.net))\/.*/

The "here" link makes no mention of regular expressions at all! How do I convert this regular expression to work in @match?



Solution 1:[1]

@match doesn't support regular expressions at all; it only supports globbing. You will need to convert your regular expression into multiple globs.

An @match directive is split into three parts (protocol, host name, and path), and each part is globbed separately against the corresponding part of the URL:

// @match PROTOCOL://HOSTNAME/PATH

This differs from @include, where the directive is matched against the entire URL. See: What is the difference between @include and @match in userscripts?
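The three-part matching can be sketched roughly as follows. This is a simplified illustration of match-pattern semantics, not Tampermonkey's actual implementation; the function names globToRegExp, parseMatchPattern, and matchesPattern are made up for this example.

```javascript
// Simplified sketch of match-pattern semantics (assumption: not Tampermonkey's real code).

// Turn a glob into an anchored RegExp: escape regex metacharacters,
// then let each * match any run of characters.
function globToRegExp(glob) {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp('^' + escaped.replace(/\*/g, '.*') + '$');
}

// Split "PROTOCOL://HOSTNAME/PATH" into its three parts.
function parseMatchPattern(pattern) {
  const m = /^(\*|https?|http\*):\/\/([^/]+)(\/.*)$/.exec(pattern);
  if (!m) throw new Error('Invalid match pattern: ' + pattern);
  return { scheme: m[1], host: m[2], path: m[3] };
}

function matchesPattern(pattern, urlString) {
  const { scheme, host, path } = parseMatchPattern(pattern);
  const url = new URL(urlString);
  const urlScheme = url.protocol.slice(0, -1); // "https:" -> "https"

  // Protocol: * and http*:// both stand for http or https.
  const schemeOk = (scheme === '*' || scheme === 'http*')
    ? (urlScheme === 'http' || urlScheme === 'https')
    : scheme === urlScheme;

  // Host name: only compared with the URL's host name, never with the path or query.
  let hostOk;
  if (host === '*') {
    hostOk = true;
  } else if (host.startsWith('*.')) {
    const base = host.slice(2);
    hostOk = url.hostname === base || url.hostname.endsWith('.' + base);
  } else {
    hostOk = url.hostname === host;
  }

  // Path: globbed against the path plus query string.
  const pathOk = globToRegExp(path).test(url.pathname + url.search);
  return schemeOk && hostOk && pathOk;
}

console.log(matchesPattern('https://*.stackoverflow.com/*', 'https://stackoverflow.com/questions/1')); // true
console.log(matchesPattern('https://*.example.com/*', 'https://attacker.example/?.example.com/')); // false
```

Because the host name is compared on its own, text in the path or query string can never satisfy the host part of the pattern.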

A glob such as // @include https://*.example.com/* is a potential security vulnerability because an attacker can craft a URL like https://attacker.example/?.example.com/ that allows your userscript to run on the attacker's domain. Depending on what your userscript does, the attacker might be able to use your script maliciously to steal data from your users, pwn your users, or enlist them in a DDoS attack.
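To see why whole-URL globbing is risky, here is a minimal sketch that models a glob @include as a single anchored regular expression tested against the full URL string. The helper includeGlobMatches is hypothetical and only approximates what a userscript manager does.

```javascript
// Simplified model (assumption): a glob @include is tested against the entire URL string.
function includeGlobMatches(glob, url) {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  const re = new RegExp('^' + escaped.replace(/\*/g, '.*') + '$');
  return re.test(url);
}

const include = 'https://*.example.com/*';
const results = {
  // The site the author meant to target.
  intended: includeGlobMatches(include, 'https://www.example.com/page'),
  // The first * swallows "attacker.example/?", so the query string satisfies the "host" part.
  attacker: includeGlobMatches(include, 'https://attacker.example/?.example.com/'),
};
console.log(results); // { intended: true, attacker: true }
```

Both URLs match under this model, even though the second one is served from attacker.example.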

If your regular expression chooses between several different domains, you will need to break the @include regular expression into many @match directives with globs. Note that when matching host names, *.stackoverflow.com also matches stackoverflow.com with no subdomains.

// @match https://*.stackexchange.com/*
// @match https://*.stackoverflow.com/*
// @match https://*.askubuntu.com/*
// @match https://*.superuser.com/*
// @match https://*.serverfault.com/*
// @match https://*.mathoverflow.net/*
// @match https://*.stackapps.com/*
// @exclude /^https://(chat|api|data)\./
// @exclude https://stackexchange.com/*
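If the list of domains is long, the @match lines can be generated rather than typed by hand. The following is only a convenience sketch; the domains array is copied from the alternation in the question's regular expression, and the variable names are arbitrary.

```javascript
// Domains taken from the alternation in the original regular expression.
const domains = [
  'stackexchange.com',
  'stackoverflow.com',
  'askubuntu.com',
  'superuser.com',
  'serverfault.com',
  'mathoverflow.net',
  'stackapps.com',
];

// One @match directive per domain; *. covers both the bare domain and its subdomains.
const matchLines = domains.map((domain) => `// @match https://*.${domain}/*`);

console.log(matchLines.join('\n'));
```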

Because globs are less expressive than regular expressions, some @include directives will not be able to be expressed as @match. If you are using a regular expression to match against very specific URL paths on a site, you may have to move the logic for determining whether or not your user script should run on a particular path into @exclude rules or into your script itself.
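As a rough sketch of moving path logic into the script itself: keep a broad @match in the header and bail out early when the path is not one you care about. The rule below (only run on numbered question pages) and the names QUESTION_PATH, shouldRunOn, and main are hypothetical.

```javascript
// Hypothetical rule: only run on question pages such as /questions/12345/some-title.
const QUESTION_PATH = /^\/questions\/\d+(\/|$)/;

function shouldRunOn(pathname) {
  return QUESTION_PATH.test(pathname);
}

function main() {
  // The body of the userscript would go here.
}

// In a browser, location.pathname is the path of the current page.
// The typeof guard lets this snippet also load outside a browser.
if (typeof location !== 'undefined' && shouldRunOn(location.pathname)) {
  main();
}
```

This keeps the full expressiveness of regular expressions for the path while the host is still restricted by @match.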

There are also new restrictions on globbing against host names. The wildcard must come at the beginning and must be followed by a dot (.). So matching all TLDs with example.* is not possible, nor is matching partial domain names like *example.com. See Google's documentation for match patterns for full details.
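The host name restrictions can be expressed as a small check. This is a sketch based on the rules described above (plus the bare * host that Google's match-pattern documentation allows), not a complete validator; isValidHostPattern is a made-up name.

```javascript
// Sketch of the host-part rules: a wildcard may only appear as a bare "*"
// or as a leading "*." followed by the rest of the host name.
function isValidHostPattern(host) {
  if (host === '*') return true;
  const rest = host.startsWith('*.') ? host.slice(2) : host;
  return rest.length > 0 && !rest.includes('*');
}

console.log(isValidHostPattern('*.example.com')); // true
console.log(isValidHostPattern('example.*'));     // false
console.log(isValidHostPattern('*example.com'));  // false
```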

Note: If you were previously using @exclude directives, you do not need to make any changes to those. The @exclude directive is not being deprecated. Because it excludes domains, rather than includes them, @exclude is much less likely to introduce security vulnerabilities.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1