I want to get the HTML of the web page, but I am getting the HTML of my chrome extension instead

I want to extract the HTML of a webpage so that I can analyze it and send a notification to my chrome extension, similar to how an adblocker analyzes a web page for ads and then tells the extension how many possible ads there are.

I am trying to use the document object in a content script to get the HTML, but I always seem to get the HTML of my popup file instead. Can anybody help?

content-script.js


chrome.tabs.onActivated.addListener(function(activeInfo) {
  chrome.tabs.get(activeInfo.tabId, function(tab) {
    console.log("[content.js] onActivated");

    chrome.tabs.sendMessage(
      activeInfo.tabId,
      {
        content: document.all[0].innerText,
        type: "from_content_script",
        url: tab.url
      },
      {},
      function(response) {
        console.log("[content.js]" + window.document.all[0].innerText);
      }
    );
  });
});

chrome.tabs.onUpdated.addListener((tabId, change, tab) => {
  if (tab.active && change.url) {
    console.log("[content.js] onUpdated");

    chrome.tabs.sendMessage(
      tabId,
      {
        content: document.all[0].innerText,
        type: "from_content_script",
        url: change.url
      },
      {},
      function(response) {
        console.log("[content.js]" + window.document.all[0].innerText);
      }
    );
  }
});

background.js


let messageObj = {};
chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
  // Arbitrary string allowing the background to distinguish
  // message types. You might also be able to determine this
  // from the `sender`.
  if (message.type === "from_content_script") {
    messageObj = message;
  } else if (message.type === "from_popup") {
    sendResponse(messageObj);
  }
});

manifest.json

{
  "short_name": "Extension",
  "version": "1.0.0",
  "manifest_version": 3,
  "name": "My Extension",
  "description": "My Extension Description",
  "permissions": ["identity", "activeTab", "tabs"],
  "icons": {
    "16": "logo-16.png",
    "48": "logo-48.png",
    "128": "logo-128.png"
  },
  "action": {
    "default_icon": "ogo_alt-16.png",
    "default_popup": "popup.html"
  },
  "content_scripts": [
    {
      "matches": ["http://*/*", "https://*/*"],
      "js": ["./static/js/content-script.js"],
      "run_at": "document_end"
    }
  ],
  "background": {
    "service_worker": "./static/js/background.js"
  }
}



Solution 1:[1]

Your current content script is nonfunctional because content scripts cannot access the chrome.tabs API. If it kinda worked for you, the only explanation is that you loaded it in the popup, which is wrong because the popup is not a web page: it's a separate page with a chrome-extension:// URL, so its document is the popup's own HTML.
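To check which context a script is actually running in, a small diagnostic along these lines may help. It is only a sketch: `describeContext` and its return strings are hypothetical names made up for illustration, and it relies on two observations from the paragraph above (extension pages have a chrome-extension:// URL, and content scripts do not get `chrome.tabs`).

```javascript
// Hypothetical debugging helper (not part of any Chrome API):
// reports which kind of context a script appears to be running in.
function describeContext(pageUrl, chromeApi) {
  const protocol = new URL(pageUrl).protocol;
  const hasTabsApi = Boolean(chromeApi && chromeApi.tabs);
  if (protocol === 'chrome-extension:') {
    // Popup, options page, etc.: `document` is the extension's own HTML.
    return 'extension page';
  }
  // In a web page, a content script has no chrome.tabs.
  return hasTabsApi ? 'unknown' : 'content script';
}

// Log only when the browser globals exist.
if (typeof location !== 'undefined' && typeof chrome !== 'undefined') {
  console.log('[context]', describeContext(location.href, chrome));
}
```

If this logs "extension page" from a file you meant to be a content script, the file is being loaded by popup.html rather than injected into the tab.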

For your current goal, there's no need for the background script at all because you can simply send a message from the popup to the content script directly to get the data. Since you're showing the info on demand, there's also no need to run the content scripts all the time on every site, i.e. you can remove content_scripts from manifest.json and inject the code on demand from the popup.
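For reference, the first option mentioned above (messaging the content script directly from the popup) could look roughly like the sketch below. It assumes you keep content_scripts in manifest.json; the message type `get_page_text` and the function names are made up for illustration. The two parts belong in separate files, as the comments indicate.

```javascript
// content-script.js: runs inside the web page, so `document` is the page's DOM.
function handleRequest(message, sender, sendResponse) {
  if (message.type === 'get_page_text') { // hypothetical message type
    sendResponse({
      text: document.documentElement.innerText,
      url: document.URL,
    });
  }
}
if (typeof chrome !== 'undefined' && chrome.runtime && chrome.runtime.onMessage) {
  chrome.runtime.onMessage.addListener(handleRequest);
}

// popup.js: ask the content script in the active tab for its text.
// With no callback argument, sendMessage returns a promise in Manifest V3.
async function requestPageText() {
  const [tab] = await chrome.tabs.query({active: true, currentWindow: true});
  return chrome.tabs.sendMessage(tab.id, {type: 'get_page_text'});
}
```

Note that the `sendMessage` promise rejects when no content script is listening in the tab (for example on a chrome:// page, or a tab that was opened before the extension was installed), so the popup should wrap the call in try/catch.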

TL;DR. Remove content_scripts and background from manifest.json, remove background.js and content-script.js files.

manifest.json:

  "permissions": ["activeTab", "scripting"],

popup.html:

<body>
your UI
<script src=popup.js></script>
</body>

popup.js:

(async () => {
  const [tab] = await chrome.tabs.query({active: true, currentWindow: true});
  let result;
  try {
    [{result}] = await chrome.scripting.executeScript({
      target: {tabId: tab.id},
      func: () => document.documentElement.innerText,
    });
  } catch (e) {
    document.body.textContent = 'Cannot access page';
    return;
  }
  // process the result
  document.body.textContent = result;
})();
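If you only need a number rather than the whole text, the analysis can run inside the page and return just the result. The following is a sketch under assumptions: `countMatches` and `countInActiveTab` are hypothetical names, and the selector list is a placeholder, not a real ad-detection heuristic. It uses the `args` option of `chrome.scripting.executeScript` because the injected function is serialized and cannot see variables from popup.js.

```javascript
// Runs in the page. Must be self-contained: it only sees its arguments
// and the page's own globals such as `document`.
function countMatches(selectors) {
  if (selectors.length === 0) return 0; // an empty selector string would throw
  // A comma-joined selector list counts each matching element once.
  return document.querySelectorAll(selectors.join(',')).length;
}

// Runs in the popup.
async function countInActiveTab(selectors) {
  const [tab] = await chrome.tabs.query({active: true, currentWindow: true});
  const [{result}] = await chrome.scripting.executeScript({
    target: {tabId: tab.id},
    func: countMatches,
    args: [selectors], // values must be JSON-serializable
  });
  return result;
}
```

In popup.js this might be called as `countInActiveTab(['iframe', '[class*="banner"]'])`, inside the same kind of try/catch as above, since injection fails on pages the extension cannot access.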

If you want to analyze the page automatically and display some number on the icon, then you will need the background script and possibly content_scripts in manifest.json, but that's a different task.
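As a rough idea of what that different task involves, the background part could be sketched as follows. This is an assumption-laden outline, not a complete solution: the message type `page_analyzed`, the `count` field, and the function names are all hypothetical, and it presumes a content script that sends `chrome.runtime.sendMessage({type: 'page_analyzed', count})` after analyzing the page.

```javascript
// background.js (service worker)

// An empty string clears the badge, so pages with nothing found show no number.
function badgeTextFor(count) {
  return count > 0 ? String(count) : '';
}

function onPageAnalyzed(message, sender) {
  // sender.tab is only set when the message comes from a content script.
  if (message.type !== 'page_analyzed' || !sender.tab) return;
  chrome.action.setBadgeText({
    tabId: sender.tab.id, // show the number only for that tab
    text: badgeTextFor(message.count),
  });
}

if (typeof chrome !== 'undefined' && chrome.runtime && chrome.runtime.onMessage) {
  chrome.runtime.onMessage.addListener(onPageAnalyzed);
}
```

Setting the badge per tab keeps the number tied to the page it was computed for, instead of showing the last result on every tab.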

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
