How to match a URL using XPath regular expressions
I need help with XPath. I have the following XML:
<unaryExpression tokenValue="'http://google.com'" tokenLine="1" tokenColumn="8">
<postfixExpression tokenValue="'http://google.com'" tokenLine="1" tokenColumn="8">
<leftHandSideExpression tokenValue="'http://google.com'" tokenLine="1" tokenColumn="8">
<newExpression tokenValue="'http://google.com'" tokenLine="1" tokenColumn="8">
<memberExpression tokenValue="'http://google.com'" tokenLine="1" tokenColumn="8">
<primaryExpression tokenValue="'http://google.com'" tokenLine="1" tokenColumn="8">
<literal tokenValue="'http://google.com'" tokenLine="1" tokenColumn="8">
<stringLiteral tokenValue="'http://google.com'" tokenLine="1" tokenColumn="8">
<LITERAL tokenValue="'http://google.com'" tokenLine="1" tokenColumn="8"/>
</stringLiteral>
</literal>
</primaryExpression>
</memberExpression>
</newExpression>
</leftHandSideExpression>
</postfixExpression>
</unaryExpression>
I need to find the URL. This is what I currently use:
//LITERAL[contains(@tokenValue, 'http://')]
How can I use a regular expression like this one to find the URL instead?
(http://|https://|ftp://)([a-z0-9]{1})((\.[a-z0-9-])|([a-z0-9-]))*\.([a-z]{2,4})(\/?)
Solution 1:[1]
If your XPath engine supports XPath 2.0, use fn:matches, which is the regular-expression counterpart of fn:contains. XPath 1.0 has no support for regular expressions.
//LITERAL[fn:matches(@tokenValue, '(http://|https://|ftp://)([a-z0-9]{1})((\.[a-z0-9-])|([a-z0-9-]))*\.([a-z]{2,4})(/?)')]
This returns all <LITERAL/> elements whose @tokenValue attribute matches your regular expression.
There is a problem in your expression: you don't have to (and are not allowed to) escape the / in the last match group. I fixed that in my query. Why are you using the last two match groups anyway?
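If you are stuck on an XPath 1.0 engine, one possible workaround is to select the candidate elements with a plain path expression and apply the regular expression in the host language. Below is a minimal sketch in Python using only the standard library. The trimmed XML, the second non-URL LITERAL element, and the names URL_RE and find_url_literals are illustrative assumptions, not part of the question.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed-down version of the XML from the question. The second LITERAL
# is a made-up non-URL entry, added only to show that filtering works.
XML = """<stringLiteral tokenValue="'http://google.com'" tokenLine="1" tokenColumn="8">
  <LITERAL tokenValue="'http://google.com'" tokenLine="1" tokenColumn="8"/>
  <LITERAL tokenValue="'just a string'" tokenLine="2" tokenColumn="8"/>
</stringLiteral>"""

# The regular expression from the answer (with the unescaped trailing slash).
URL_RE = re.compile(
    r"(http://|https://|ftp://)([a-z0-9]{1})"
    r"((\.[a-z0-9-])|([a-z0-9-]))*\.([a-z]{2,4})(/?)"
)


def find_url_literals(xml_text):
    """Return LITERAL elements whose tokenValue contains a URL."""
    root = ET.fromstring(xml_text)
    # ElementTree implements only a small XPath subset without regex support,
    # so select by element name and attribute presence, then filter in Python.
    candidates = root.iterfind(".//LITERAL[@tokenValue]")
    # search() looks for the pattern anywhere in the value, which is also
    # how fn:matches behaves when the pattern is not anchored.
    return [el for el in candidates if URL_RE.search(el.get("tokenValue"))]


if __name__ == "__main__":
    for el in find_url_literals(XML):
        print(el.get("tokenValue"))
```

The same split (path selection in XPath, pattern matching outside it) should carry over to other host languages, at the cost of no longer having a single self-contained query.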
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Jens Erat |
