How to scrape data from an almost empty HTML page in ASP.NET
The website's HTML looks like this:
<html>
<head></head>
<body>123445.87</body>
</html>
I need to scrape the number in the body; that is all there is in that HTML. How do I do that? When I debug this program, it just returns "Error: no data scraped". Here's my code:
using System.Collections.Generic;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;
using HtmlAgilityPack;
using Microsoft.AspNetCore.Mvc;

[ApiController]
public class ValuesController : ControllerBase
{
    [Route("scrape")]
    [AcceptVerbs("GET")]
    public async Task<List<string>> GetValue()
    {
        List<string> Datalst = new List<string>();
        HttpClient hc = new HttpClient();
        HttpResponseMessage result = await hc.GetAsync($"https://mywebsite.com/");
        Stream stream = await result.Content.ReadAsStreamAsync();
        HtmlDocument doc = new HtmlDocument();
        doc.Load(stream);
        // SelectNodes returns null (not an empty collection) when nothing matches.
        var Value = doc.DocumentNode.SelectNodes("//html/body");
        if (Value == null)
            Datalst.Add("Error: no data scraped");
        else
        {
            foreach (var item in Value)
            {
                Datalst.Add(item.InnerText);
            }
        }
        return Datalst;
    }
}
Solution 1:[1]
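OK, I found a solution. I changed this line of code:
var Value = doc.DocumentNode.SelectNodes("//html/body");
to this:
var Value = doc.DocumentNode.InnerHtml;
Value is then a string rather than a node collection, so the null check and the foreach loop that follow have to go as well; add the string to the list directly with Datalst.Add(Value). The XPath version presumably returned null because the server sends only the bare number, and the html, head, and body tags seen in the browser are added by the browser itself.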
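For reference, here is a minimal sketch of a variant that reads InnerText instead of InnerHtml, so it should return the same value whether the server sends the full html wrapper or only the bare number. It assumes the HtmlAgilityPack package that the question's code already uses. BodyValueReader, ExtractText, and TryReadNumber are hypothetical names chosen for illustration and are not part of the original post.

```csharp
using System;
using System.Globalization;
using HtmlAgilityPack;

// Hypothetical helper, not from the original answer.
public static class BodyValueReader
{
    // Returns the document's text with markup removed and whitespace trimmed.
    // Works on a full HTML page and on a response that is only bare text.
    public static string ExtractText(string responseBody)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(responseBody);

        // InnerText drops tags; DeEntitize decodes entities such as &amp;.
        string text = HtmlEntity.DeEntitize(doc.DocumentNode.InnerText);
        return text.Trim();
    }

    // Tries to parse the extracted text as a decimal, assuming the site
    // always uses "." as the decimal separator (invariant culture).
    public static bool TryReadNumber(string responseBody, out decimal value)
    {
        return decimal.TryParse(
            ExtractText(responseBody),
            NumberStyles.Number,
            CultureInfo.InvariantCulture,
            out value);
    }
}
```

In the controller, this could be used by fetching the response with await hc.GetStringAsync("https://mywebsite.com/") and passing the resulting string to BodyValueReader.ExtractText before adding it to Datalst. Parsing with the invariant culture is a design choice here: it keeps the result independent of the server's regional settings, at the cost of failing if the site ever formats the number with a comma as the decimal separator.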
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Dziamukas |
