Using RegEx to extract data from an anchor tag

I have the following anchor tag in an HTML document, and I want to extract the link, the text, and the verses that follow it:

<a href="https://www.catholicgallery.org/bible-drb/acts-9/">Acts 9:</a> 1-20

I have tried using two different methods.

The first calls TestRegEx with:
    IEnumerable<Tuple<string, string, string>> tuple = TestRegEx(reading.readinghRef);
where TestRegEx is:
    protected IEnumerable<Tuple<string, string, string>> TestRegEx (string html)
    {
        Regex r = new Regex(@"<a.*?href=(""|')(?<href>.*?)(""|').*?>(?<value>.*?)</a>\s(?<verses>.*?)");

        foreach (Match match in r.Matches(html))
            yield return new Tuple<string, string, string>(
                match.Groups["href"].Value, match.Groups["value"].Value, match.Groups["verses"].Value);
    }
I have also tried:
            Regex regex = new Regex(@"<a\shref=""(?<url>.*?)"">(?<text>.*?):</a>\s(?<verses>.*?)");
            Match match = regex.Match(reading.readinghRef);

            string text = match.Groups["text"].Value;
            string[] textParts = text.Split(' ');
            string verses = match.Groups["verses"].Value;

            string book = "";
            for (int i = 0; i < textParts.Length - 1; i++)
            {
                if (book.Length > 0)
                    book += " ";
                book += textParts[i];
            }

            string chapter = textParts[textParts.Length - 1];
Both succeed in getting the book and the URL, but neither gets the verses. Item 2 in the tuple is not yet split into book and chapter, but that is not the problem. The problem is that the verses at the end of the HTML string are never captured.


Solution 1:[1]

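The only problem with your first regex is the non-greedy quantifier in the last group. Nothing follows that group in the pattern, so the lazy .*? is satisfied by matching zero characters and verses always comes back empty: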

(?<verses>.*?)

Replace it with the greedy version, and you'll get the verses. The second regex ends with the same lazy group, so the same change applies there.

(?<verses>.*)

https://regex101.com/r/w2lgaX/1
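
The difference can be checked with a minimal sketch like the one below. It reuses the first pattern from the question unchanged up to the verses group and switches only the final quantifier. The class name VersesDemo and the method ExtractVerses are hypothetical names made up for this illustration; they do not appear in the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of the question's code.
public static class VersesDemo
{
    // The question's first pattern, unchanged, up to (but not including) the verses group.
    private const string Prefix =
        @"<a.*?href=(""|')(?<href>.*?)(""|').*?>(?<value>.*?)</a>\s";

    // Returns the text captured by the "verses" group, or null when the input does not match.
    public static string ExtractVerses(string html, bool greedy)
    {
        // A trailing lazy .*? stops as early as it can (zero characters);
        // a trailing greedy .* keeps going to the end of the line.
        string pattern = Prefix + (greedy ? "(?<verses>.*)" : "(?<verses>.*?)");
        Match match = Regex.Match(html, pattern);
        return match.Success ? match.Groups["verses"].Value : null;
    }
}
```

With the anchor tag from the question as input, ExtractVerses(html, greedy: false) returns an empty string and ExtractVerses(html, greedy: true) returns "1-20". Note that . does not match a newline by default, so the greedy group stops at the end of the line that contains the tag.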

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 ejkeep