JS Regex to find href of several a tags
I need a regex to find the contents of the hrefs from these a tags:
<p class="bc_shirt_delete">
<a href="/CustomContentProcess.aspx?CCID=13524&OID=3936923&A=Delete" onclick="javascript:return confirm('Are You sure you want to delete this item?')">delete</a>
</p>
Just the URLs, not the href attribute or the surrounding tags.
I'm parsing a plain text ajax request here, so I need a regex.
Solution 1:[1]
This will do it nicely. http://jsfiddle.net/grantk/cvBae/216/
Regex example: https://regex101.com/r/nLXheV/1
var str = '<p href="missme" class="test"><a href="/CustomContentProcess.aspx?CCID=13524&OID=3936923&A=Delete" onclick="">delete</a></p>';
var patt = /<a[^>]*href=["']([^"']*)["']/g;
var match;
// With the g flag, exec() resumes after the previous match on every call
while ((match = patt.exec(str)) !== null) {
alert(match[1]);
}
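If you want the URLs in an array rather than one alert per match, the same pattern can be fed to matchAll(). This is only a sketch: the helper name extractAnchorHrefs and the sample string are made up for illustration.

```javascript
// Hypothetical helper: collects every href value found in <a> tags.
function extractAnchorHrefs(text) {
  // Same pattern as above; matchAll() requires the g flag.
  const pattern = /<a[^>]*href=["']([^"']*)["']/g;
  // Each match is an array; index 1 holds the captured URL.
  return Array.from(text.matchAll(pattern), (m) => m[1]);
}

// Made-up sample: the href on the <p> must be ignored.
const sample =
  '<p href="missme"><a href="/first?A=Delete">one</a></p>' +
  "<p><a class='x' href='/second'>two</a></p>";

console.log(extractAnchorHrefs(sample));
```

An input without any links simply yields an empty array, so there is no null to guard against.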
Solution 2:[2]
Here is a robust solution:
let href_regex = /<a\b([^>]*?)href\s*=\s*(['"])(.*?)\2[^>]*>/i,
link_text = '<a href="/another-article/">another article link</a>',
match = link_text.match ( href_regex ),
href = match ? match[3] : null;
What it does:
- detects a tags
- lazily skips over any other HTML attributes before href and groups them (1)
- matches the href attribute
- takes into consideration possible whitespace around =
- makes a group (2) of the opening quote, ' or ", so the closing quote can reuse it
- lazily matches everything up to that same quote and groups it (3); this is the URL
- matches the closing quote through the backreference \2 (a backreference does not work inside a character class such as [^\2])
- matches whatever else is there until the closing > of the tag
- sets the i flag to ignore case
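Matching the closing quote with a backreference is what lets the URL itself contain the other kind of quote. A minimal sketch of that idea, with a hypothetical helper name (firstHref) and made-up sample links:

```javascript
// Hypothetical helper: returns the first href value in an <a> tag, or null.
function firstHref(text) {
  // (['"]) captures the opening quote; \1 requires the same quote to close.
  // \s before href and \s* around = tolerate extra whitespace.
  const m = /<a\b[^>]*?\shref\s*=\s*(['"])(.*?)\1/i.exec(text);
  return m ? m[2] : null;
}

// A double-quoted URL containing a single quote.
console.log(firstHref('<a href="/it\'s-here/">x</a>'));
// A single-quoted URL containing double quotes, upper case, spaces around =.
console.log(firstHref("<A class='c' HREF = '/say-\"hi\"/'>y</a>"));
```

A character class such as [^"'] would cut both of these URLs short at the inner quote.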
Solution 3:[3]
You may not need a regex for that.
var links = document.getElementsByTagName('a');
var urls = [];
for (var i = 0; i < links.length; i++) {
// .href is the resolved absolute URL; use getAttribute('href') for the raw value
urls[i] = links[i].href;
}
If the input is plain text, you can insert it into an element that is not displayed (for example, one styled with display: none) and then process it the same way.
Solution 4:[4]
It might be easier to use jQuery
var html = '<li><h2 class="saved_shirt_name">new shirt 1</h2><button class="edit_shirt">Edit Shirt</button><button class="delete_shirt" data-eq="0" data-href="/CustomContentProcess.aspx?CCID=13524&OID=3936923&A=Delete">Delete Shirt</button></li><li><h2 class="saved_shirt_name">new shirt 2</h2><button class="edit_shirt">Edit Shirt</button><button class="delete_shirt" data-eq="0" data-href="/CustomContentProcess.aspx?CCID=13524&OID=3936924&A=Delete">Delete Shirt</button></li><li><h2 class="saved_shirt_name">new shirt 3</h2><button class="edit_shirt">Edit Shirt</button><button class="delete_shirt" data-eq="0" data-href="/CustomContentProcess.aspx?CCID=13524&OID=3936925&A=Delete">Delete Shirt</button></li>';
$(html).find('[data-href]');
Then iterate over each matched node.
UPDATE (because the question was updated)
Let html be your raw response:
var matches = $(html).find('[href]');
var hrefs = [];
$.each(matches, function(i, el){ hrefs.push($(el).attr('href'));});
//hrefs is an array of matches
Solution 5:[5]
I combined a few solutions around and came up with this (Tested in .NET):
(?<=href=[\'\"])([^\'\"]+)
Explanation:
(?<=) : lookbehind, so href= and the opening quote are required but not included in the match
[\'\"] : matches either a single or a double quote
[^] : matches any character except those listed after the '^'
+ : one or more occurrences of the preceding element
This works well and does not run past the closing quote, because the character class stops matching as soon as it reaches a quote.
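The same lookbehind pattern can be tried in JavaScript, assuming an engine that supports lookbehind assertions (they were added in ES2018, so older browsers may reject the pattern). A minimal sketch with a made-up sample string:

```javascript
// Made-up sample with one double-quoted and one single-quoted href.
const html =
  '<p><a href="/a?x=1&y=2">one</a> <a href=\'/b\'>two</a></p>';

// (?<=href=['"]) asserts that href=" or href=' sits right before the match
// without consuming it; [^'"]+ then takes everything up to the next quote.
const hrefPattern = /(?<=href=['"])[^'"]+/g;

// With the g flag, match() returns the matched substrings, or null if none.
const found = html.match(hrefPattern) || [];
console.log(found);
```

Because nothing is captured in a group, the matches are the URLs themselves and no index lookup is needed.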
Solution 6:[6]
var str = "";
str += "<p class=\"bc_shirt_delete\">";
str += "<a href=\"/CustomContentProcess.aspx?CCID=13524&OID=3936923&A=Delete\" onclick=\"javascript:return confirm('Are You sure you want to delete this item?')\">delete</a>";
str += "</p>";
var matches = [];
str.replace(/href=("|')(.*?)("|')/g, function(a, b, match) {
matches.push(match);
});
console.log(matches);
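Or, if you only need the first match and don't mind that the result also contains the full href=... text (the URL itself is in matches[2]):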
var matches = str.match(/href=("|')(.*?)("|')/);
console.log(matches);
Solution 7:[7]
What about spaces around the = sign? This pattern allows for them (the URL ends up in matches[4]):
var matches = str.match(/href( *)=( *)("|'*)(.*?)("|'*)( |>)/);
console.log(matches);
Solution 8:[8]
It's important to be non-greedy, and to cater for matching ' or " quotes.
var test = '<a href="#" class="foo bar"> banana\n' +
"<a href='http://google.de/foo?yes=1&no=2' data-href='foobar'/>";
test.replace(/href=(?:'.*?'|".*?")/gi, '');
Disclaimer: the one thing this does not account for is HTML5 attributes such as data-href, whose href=... part is matched as well.
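If the goal is to read the URLs rather than strip them, and to leave attributes such as data-href alone, one option is a negative lookbehind that rejects href when it is the tail of a longer attribute name. This is a sketch only, assuming an engine with lookbehind support (ES2018); the variable names are made up.

```javascript
const markup =
  '<a href="#" class="foo bar"> banana\n' +
  "<a href='http://google.de/foo?yes=1&no=2' data-href='foobar'/>";

// (?<![\w-]) rejects matches where "href" follows a word character or a
// hyphen, as in data-href. The alternation keeps the quotes paired.
const attrPattern = /(?<![\w-])href=(?:'(.*?)'|"(.*?)")/gi;

const values = [];
for (const m of markup.matchAll(attrPattern)) {
  // Exactly one of the two groups participates in any given match.
  values.push(m[1] !== undefined ? m[1] : m[2]);
}
console.log(values);
```

The data-href value is skipped, so only the two real href values are collected.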
Solution 9:[9]
In this specific case, this is probably the fastest pattern:
/f="([^"]*)/
- It captures every character (letters, digits, newlines, etc.) from f=" up to, but not including, the next ". Flags such as /is are unnecessary, and the match is null if nothing is found.
However, if the source contains many other links, you need to make sure you get exactly the one you are looking for. One way is to include more of the surrounding source code in the pattern, for example (this of course depends on the markup of the source page):
/bc_shirt_delete">\s*<a href="([^"]*)
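In JavaScript, the anchored pattern could be applied roughly like this. It is a sketch under the assumption that the response looks like the markup in the question; the onclick text is shortened and the second link is made up to show that it is ignored.

```javascript
const response =
  '<p class="bc_shirt_delete">\n' +
  '<a href="/CustomContentProcess.aspx?CCID=13524&OID=3936923&A=Delete" ' +
  'onclick="return confirm(\'Are You sure?\')">delete</a>\n' +
  '</p>\n' +
  '<p><a href="/other">other</a></p>';

// Anchoring on the class name skips unrelated links in the same response.
const anchored = /bc_shirt_delete">\s*<a href="([^"]*)/;
const hit = response.match(anchored);

// match() returns null when the pattern is not found, so guard the access.
const deleteUrl = hit ? hit[1] : null;
console.log(deleteUrl);
```

The trade-off is that the pattern is tied to this exact markup: a reordered attribute or a different quote style would make it miss.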
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Community |
| Solution 3 | |
| Solution 4 | |
| Solution 5 | EBFE |
| Solution 6 | |
| Solution 7 | bummi |
| Solution 8 | Frank Nocke |
| Solution 9 | |

