How can we read an HTML response and create JSON in Node.js?
I have an endpoint that I must use to get some details, but the response comes back in HTML format. I need to parse the HTML shown below and turn it into a valid JSON object.
<body>
<script type="text/javascript">_page.startPage("")</script>
<!-- begin content area -->
<table id="p-ca" border="0" cellspacing="0" cellpadding="0" width=100%>
<tr id="p-carow1" valign="top">
<td id="p-column1" valign="top">
<TABLE CELLSPACING=3 CELLPADDING=3 BORDER=0>
<TR>
<td valign=top>
<br>
<font face=arial size=3><b>Search CODE1 Domain</b><br>
<form name="search" action=code_users.asp>
<font size=2>Search for: <input name="user" type="text" size=25 onfocus="this.select();" value="320061731"><br><font color="grey" size="1"> Wildcards are not necessary and will be ignored</font><br><br>
<input type="submit" value="Find">
</form>
<script>
document.forms[0].elements[0].focus();
</script>
<br>
<br>
<table width=100% border=0 cellspacing=0>
<tr bgcolor=cee3e7><td width=100><font face=Arial size=3><b>Username</b></td><td width=150><font face=Arial size=3><b>Full Name</b></td><td width=300><font face=Arial size=3><b>Description</b></td><td><font face=Arial size=3><b>Employee #</b></font></td><td width=100><font face=Arial size=3><b>Location</b></td><td style="padding-left:10px;"><font face=Arial size=3><b>Email</b></td><td style="padding-left:10px;"><font face=Arial size=3><b>Company</b></td><td style="padding-left:10px;"><font face=Arial size=3><b>Disabled?</b></td><!--<td width=100><font face=Arial size=3><b>Last Login</b></td>--></tr>
<tr><td colspan=2> </td></tr>
<tr bgcolor=FFFFFF><td valign=top><font face="Courier New" size=2>320061731</td><td valign=top><font face="Courier New" size=2>Dhanalakshssmi R</td><td valign=top><font face="Courier New" size=2>Dhanalakshmi R</td><td valign=top><font face="Courier New" size=2>32320061731</td><td valign=top><font face="Courier New" size=2></td><td valign=top style="padding-left:10px; padding-right:10px;"><font face="Courier New" size=2>[email protected]</td><td valign=top><font face="Courier New" size=2>Cibersites</td><td valign=top align=center><font face="Courier New" size=2>False</td></tr>
</table>
</td>
</tr>
</table>
<!-- end content area -->
<script type="text/javascript">_page.endPage()</script>
<a href="site_data" class="p-robots"></a>
</body>
I need to extract the details (32320061731, name, etc.) from the data row of this response, and I am trying to do it in Node.js. Please let me know how to parse the response from the GET request.
Solution 1:[1]
Hello Dhanlaxmi, if you want to extract data from an HTML page, you have to implement web scraping.
Use the cheerio npm package:
const cheerio = require("cheerio");
Complete reference: https://medium.com/@stefanhyltoft/scraping-html-tables-with-nodejs-request-and-cheerio-e3c6334f661b
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | vishal sharma |
