How to get an HTML5 element by XPath with libxml2 in C++
I want to select div elements from an HTML document by XPath using libxml2 in C++, but the query finds nothing even though the HTML contains many div tags. With an expression like /html/body/div[1]/div/div it even crashes.
htmlParserCtxtPtr parse_ctx = htmlCreateMemoryParserCtxt(resp.text.c_str(), resp.text.size());
if (!parse_ctx) {
    std::cout << "Error!" << std::endl;
    return;
}
xmlXPathContextPtr xml_ctx = xmlXPathNewContext(parse_ctx->myDoc);
if (!xml_ctx) {
    std::cout << "Error!" << std::endl;
    return;
}
xmlXPathObjectPtr xpath_obj = xmlXPathEvalExpression((xmlChar *)"//div", xml_ctx);
if (!xpath_obj) {
    std::cout << "Error!" << std::endl;
    return;
}
xmlNodeSetPtr nodes = xpath_obj->nodesetval;
std::cout << nodes->nodeNr << std::endl; // result is 0.
I also tried the htmlParseElement function instead of the XML functions, but it reports errors on HTML5 tags (unknown tag error).
Solution 1:[1]
htmlCreateMemoryParserCtxt only creates a parser context; it does not parse the document, so parse_ctx->myDoc is still NULL and the XPath context has no document to search. Try htmlReadMemory instead, which parses the buffer and returns an xmlDocPtr.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | nwellnhof |
