Puppeteer crashes Chromium while scrolling WhatsApp Web
I'm trying to use Puppeteer to scroll all the way up to the oldest message of a WhatsApp chat so I can grab the HTML of the whole conversation and scrape that data. My current program does scroll up, but Chromium crashes midway. It is a pretty large conversation (more than 40k messages), and I want to know whether the size itself is the cause or whether there's something I can do to prevent the crash. I've tried changing the waiting time between scrolls, since I thought I was scrolling too fast, but it made no difference.
Here's my code:
const CONTACT_NAME = 'name';
const puppeteer = require('puppeteer');
const fs = require('fs/promises');

(async () => {
  const browser = await puppeteer.launch({
    defaultViewport: null,
    headless: false, // default is true
    args: ['--no-sandbox', '--disable-setuid-sandbox'],
  });
  const page = await browser.newPage();
  await page.goto('https://web.whatsapp.com');
  await page.setViewport({
    width: 800,
    height: 800,
  });
  await page.waitForSelector(`[title="${CONTACT_NAME}"]`, { timeout: 1000000 });
  await page.click(`[title="${CONTACT_NAME}"]`);
  await page.waitForTimeout(1 * 20 * 1000);

  const scrollable_section = '_33LGR';

  function scroll() {
    return page.evaluate((selector) => {
      const scrollableSection = document.getElementsByClassName(selector)[0];
      return new Promise((resolve, reject) => {
        console.log(scrollableSection);
        let previousHeight = scrollableSection.scrollHeight;
        scrollableSection.scrollTop = 0;
        let counter = 0;
        const timer = setInterval(() => {
          const scrollHeight = scrollableSection.scrollHeight;
          console.log({ previousHeight, scrollHeight });
          if (previousHeight === scrollHeight) {
            counter++;
            if (counter === (2 * 60 * 1000) / 3000) {
              clearInterval(timer);
              resolve();
            }
          } else {
            scrollableSection.scrollTop = 0;
            previousHeight = scrollHeight;
            counter = 0;
          }
        }, 3000);
      });
    }, scrollable_section);
  }

  await scroll();
  const html = await page.content();
  await fs.writeFile('output.html', html);
  await page.screenshot({ path: 'page.png' });
  await browser.close();
})();
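To make the stop condition in the script above easier to follow: the interval fires every 3000 ms and gives up once the scroll height has stayed the same for (2 * 60 * 1000) / 3000 = 40 consecutive checks, i.e. about two minutes with nothing new loaded. Below is a minimal sketch of that logic pulled out into a standalone helper. The name `createStallDetector` and its parameters are hypothetical and not part of the script; it also uses `>=` rather than `===` so the check still fires if the interval does not divide the idle time evenly.

```javascript
// Hypothetical helper (not in the original script): counts how many
// consecutive polls have reported the same scrollHeight.
function createStallDetector(idleMs, pollMs) {
  // Number of unchanged polls that counts as "nothing more is loading".
  const maxIdleTicks = Math.ceil(idleMs / pollMs);
  let previousHeight = null;
  let idleTicks = 0;

  return {
    maxIdleTicks,
    // Feed the latest scrollHeight; returns true once the height has been
    // unchanged for maxIdleTicks consecutive polls.
    update(scrollHeight) {
      if (scrollHeight === previousHeight) {
        idleTicks += 1;
      } else {
        // Height changed (or first poll): remember it and restart the count.
        previousHeight = scrollHeight;
        idleTicks = 0;
      }
      return idleTicks >= maxIdleTicks;
    },
  };
}

const detector = createStallDetector(2 * 60 * 1000, 3000);
console.log(detector.maxIdleTicks); // 40
```

Inside the `setInterval` callback this would presumably be used as `if (detector.update(scrollableSection.scrollHeight)) { clearInterval(timer); resolve(); }`, with the `scrollTop = 0` reset kept for the case where the height changed. Note that this only restates the existing stop condition; it does not by itself address the crash.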
And here's the error message in Chromium: "Aw, Snap! Something went wrong while displaying this webpage. Error code: 5."
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
