Correct way to add comments to a Regular Expression in PHP
I'm trying to add comments to make a regexp clearer:
// Strip any URLs (such as embeds) taken from http://stackoverflow.com/questions/6427530/regular-expression-pattern-to-match-url-with-or-without-http-www
$pattern =
'( # First capturing group
(http|https) # Second capturing group, matches either http or https
\:\/\/)? # End of first capturing group, matches :// exactly
[ # Match any char in the following list. The + after the closing bracket means one or more (greedy)
a-z # Any char between a and z
A-Z # Any char between A and Z
0-9 # Any char between 0 and 9
\.\/\?\:@\- # ./?:@- literally ( any one of them )
_=# # _=# any of these three chars
]+ # end of list
\. # matches .
( # third capturing group
[ # start of list
a-z # Any char between a and z
A-Z # Any char between A and Z
0-9 # Any char between 0 and 9
\.\/\?\:@\- # ./?:@- literally ( any one of them )
_=# # _=# any of these three chars
] # end of list
)* # end of capturing group with greedy modifier';
$excerpt = preg_replace("/$pattern/x", '', $excerpt );
But I get the warning:
Warning: preg_replace(): Unknown modifier '/' in on line 280
How should I comment it?
Solution 1:[1]
This was brought up in a comment on the php.net modifiers page.
To quote:
When adding comments with the /x modifier, don't use the pattern delimiter in the comments. It may not be ignored in the comments area.
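In your example, one of your comments contains the string ://. PHP appears to locate the closing delimiter before it applies any modifiers, so it does not know that the text after # is a comment under the x modifier. The first unescaped / in that comment therefore ends the pattern, and the / that follows is read as a modifier, which produces the warning. The same can be seen with the code below: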
echo preg_replace('/
a #Com/ment
/x', 'e', 'and');
You would need to either change your delimiter or escape the delimiter in comments.
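Both fixes can be sketched with the same toy pattern. This is only an illustration: the variable names are made up here, and ~ is just one possible alternative delimiter.

```php
<?php
// Option 1: pick a delimiter that appears neither in the pattern nor in its comments.
$result1 = preg_replace('~
    a   # Com/ment: a slash is harmless with this delimiter
~x', 'e', 'and');

// Option 2: keep / as the delimiter and backslash-escape it inside the comment.
$result2 = preg_replace('/
    a   # Com\/ment: the escaped slash does not end the pattern
/x', 'e', 'and');

echo $result1, PHP_EOL; // end
echo $result2, PHP_EOL; // end
```

Whichever delimiter is chosen, the same rule applies to it: it must not appear unescaped anywhere in the comments.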
Solution 2:[2]
While it has already been said that the problem in your snippet comes from using the pattern delimiter in your pattern comments, completely refactoring the pattern to implement D.R.Y. practices will make your regex much simpler to read and maintain.
- Use a delimiting character that will not be found inside your pattern -- this eliminates avoidable escaping.
- ((http|https)\:\/\/)? can be simplified to (?:https?://)? and still maintain its optional status in the pattern.
- Your alphanumeric character class plus a short list of symbols can be reduced to [\w./?:@=#-]+.
Code:
// strip urls
$pattern = <<<REGEX
~
(?:https?://)? # optionally, case-insensitively match http or https followed by colon, forwardslash, forwardslash
[\w./?:@=#-]+ # greedily match one or more characters from this list: any letter, any digit, underscore, dot, forwardslash, question mark, colon, at sign, equals, hash, hyphen
\. # match a dot
[\w./?:@=#-]* # greedily match zero or more characters from this list: any letter, any digit, underscore, dot, forwardslash, question mark, colon, at sign, equals, hash, hyphen
~ix
REGEX;
$excerpt = preg_replace($pattern, '', $excerpt);
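To see what the pattern removes, here is a small sketch that runs it against a made-up excerpt. The sample text and the $cleaned variable are assumptions for illustration, and a nowdoc (<<<'REGEX') is used instead of the heredoc so that PHP never interpolates anything inside the pattern.

```php
<?php
// Same pattern as above, with shortened comments; none of them contain the ~ delimiter.
$pattern = <<<'REGEX'
~
(?:https?://)?   # optional scheme
[\w./?:@=#-]+    # one or more allowed characters
\.               # a literal dot
[\w./?:@=#-]*    # zero or more allowed characters
~ix
REGEX;

// Hypothetical sample text, not taken from the original question.
$excerpt = 'Watch this: https://example.com/embed?v=1 and enjoy';
$cleaned = preg_replace($pattern, '', $excerpt);

echo $cleaned, PHP_EOL; // Watch this:  and enjoy (a doubled space is left behind)
```

Note that the pattern is permissive: any run of allowed characters containing a dot is removed, so a token such as e.g. or a word followed by a full stop would be stripped as well.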
Once the bloat is removed and the pattern is this compact, you may prefer to move the explanation out of the pattern and into an ordinary PHP comment placed before the pattern declaration. Long comment lines can then be wrapped freely without any risk of breaking the pattern.
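A minimal sketch of that layout, assuming the same pattern as above written on one line without the x modifier (the sample excerpt is hypothetical):

```php
<?php
// Strip URLs. The pattern matches, in order:
//   - an optional "http://" or "https://" (case-insensitive)
//   - one or more of: letters, digits, underscore, . / ? : @ = # -
//   - a literal dot
//   - zero or more characters from the same list
$pattern = '~(?:https?://)?[\w./?:@=#-]+\.[\w./?:@=#-]*~i';

$excerpt = 'See http://www.example.org/page for details'; // hypothetical input
$excerpt = preg_replace($pattern, '', $excerpt);

echo $excerpt, PHP_EOL; // See  for details
```

Because the comment lives in PHP rather than inside the regex, it can mention the delimiter or any other character without affecting how the pattern is parsed.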
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Anonymous |
| Solution 2 | mickmackusa |
