querySelectorAll doesn't capture all elements
I am trying to scan and manipulate the DOM of a webpage with the following code:
var elements = document.querySelectorAll('*');
for (var i = 0; i < elements.length; i++) {
if (!elements[i].firstElementChild) {
if (elements[i].innerHTML != "" ){
elements[i].innerHTML = "abc_"+ elements[i].innerHTML+"_123";
}
}
}
While it works well on many pages, it does not pick up all the elements on the specific page that is my real target. On that page, it captures and edits the strings of a few elements, but not all. I have also tried using getElementsByTagName().
The elements that are not captured have an XPath such as:
/html/body/div[4]/div[2]/div/div[2]/div/div/div/div/div[1]/div/div[2]/nav/div/div[1]/div/span/div/text()[1]
I also noticed "flex" written in front of these elements.
I also tried the script by Douglas Crockford, but it is also unable to catch the elements described above.
The script by Douglas is published at
function walkTheDOM(node, func) {
func(node);
node = node.firstChild;
while (node) {
walkTheDOM(node, func);
node = node.nextSibling;
}
}
// Example usage: Process all Text nodes on the page
walkTheDOM(document.body, function (node) {
if (node.nodeType === 3) { // Is it a Text node?
var text = node.data.trim();
if (text.length > 0) { // Does it have non white-space text content?
// process text
}
}
});
Any idea what I am doing wrong?
Here is a screenshot of inspect element: [screenshot not included]
Solution 1:[1]
In your snippet, you are not selecting all the nodes, since document.querySelectorAll('*') selects only elements, not text-nodes.
Besides, you are explicitly skipping some of the text: the check on .firstElementChild excludes every element that has at least one child element, so text that sits next to child elements is never touched. A text-node is not an element. An element in the DOM is a "tag", like <div> for example. It has nodeType: 1, while a text-node has nodeType: 3.
So, if you'd process for example:
OuterTextNode<div>InnerTextNode</div>
the div would be the first element, and InnerTextNode and OuterTextNode are text-nodes. Both the query selector and .firstElementChild would select only the element (div) here.
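To make the distinction concrete, here is a small sketch that lists the direct children of a node by kind. The helper name `describeChildren` is hypothetical and not part of the original answer; it relies only on the standard `childNodes`, `nodeType`, `nodeName`, and `data` properties.

```javascript
// Hypothetical helper for illustration: describe each direct child of `parent`.
// nodeType values are the standard DOM constants: 1 = element, 3 = text, 8 = comment.
function describeChildren(parent) {
  var kinds = { 1: 'element', 3: 'text', 8: 'comment' };
  var result = [];
  for (var i = 0; i < parent.childNodes.length; i++) {
    var child = parent.childNodes[i];
    var kind = kinds[child.nodeType] || 'other';
    // Elements are labelled by tag name, everything else by its character data.
    var label = child.nodeType === 1
      ? child.nodeName.toLowerCase()
      : String(child.data || '');
    result.push(kind + ': ' + label);
  }
  return result;
}
```

Called in the browser console on a container whose content is the markup above, this should report one text child (OuterTextNode) followed by one element child (div). InnerTextNode does not appear at that level because it is a child of the div.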
It should work with the DOM-tree-walking code:
const blackList = ['script']; // here you could add some node names that you want to ignore
function walkTheDOM(node, func) {
func(node);
node = node.firstChild;
while (node) {
if (!blackList.includes(node.nodeName.toLowerCase())) {
walkTheDOM(node, func);
}
node = node.nextSibling;
}
}
walkTheDOM(document.body, function(node) {
if (node.nodeType === 3) {
var text = node.data.trim();
if (text.length > 0) {
console.log(text);
console.log(`replaced: PREFIX_${text}_POSTFIX`);
}
}
});
.as-console-wrapper {
top: 0;
max-height: 100% !important;
}
<div>
All
<span>In span</span> Some more text
<div>
<div>
Some nested text
<div>Sibling</div>
<span>
Another
Another
<span>
Deep
<span>
<span>
<span>
<span>
<span>Deeper</span>
</span>
</span>
</span>
</span>
</span>
</span>
</div>
<!-- Some comment !-->
<script>
// some script
const foo = 'foo';
</script>
</div>
</div>
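The snippet above only logs the text it finds. To actually rewrite the text, as the code in the question attempts, one option is to assign to each text node's `data` instead of the parent's `innerHTML`. The following is a minimal sketch under that assumption; `wrapTextNodes`, `prefix`, `suffix`, and `skip` are illustrative names that do not appear in the original answer.

```javascript
// Sketch: wrap every non-blank text node under `root` with a prefix and a suffix.
// `wrapTextNodes` and its parameters are hypothetical names used for illustration.
// Returns the number of text nodes that were changed.
function wrapTextNodes(root, prefix, suffix, skip) {
  skip = skip || ['script', 'style']; // node names whose subtrees are left alone
  var count = 0;

  function walk(node) {
    if (node.nodeType === 3) { // text node
      var raw = node.data;
      var text = raw.trim();
      if (text.length > 0) {
        // Keep the original leading and trailing whitespace around the text.
        var start = raw.indexOf(text);
        node.data = raw.slice(0, start) + prefix + text + suffix +
          raw.slice(start + text.length);
        count++;
      }
      return;
    }
    // Same traversal as walkTheDOM: first child, then each next sibling.
    for (var child = node.firstChild; child; child = child.nextSibling) {
      if (skip.indexOf(child.nodeName.toLowerCase()) === -1) {
        walk(child);
      }
    }
  }

  walk(root);
  return count;
}
```

In a browser this could be called as `wrapTextNodes(document.body, 'abc_', '_123')`. Because only `data` is changed, child elements stay in place, so text that sits next to other elements is handled as well. Browsers also offer `document.createTreeWalker(document.body, NodeFilter.SHOW_TEXT)` as a built-in way to iterate over text nodes.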
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
