Puppeteer wait until page is completely loaded

I am working on creating a PDF from a web page.

The application I am working on is a single-page application.

I tried many of the options and suggestions in https://github.com/GoogleChrome/puppeteer/issues/1412

But none of them worked.

const browser = await puppeteer.launch({
    executablePath: 'C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe',
    ignoreHTTPSErrors: true,
    headless: true,
    devtools: false,
    args: ['--no-sandbox', '--disable-setuid-sandbox']
});

const page = await browser.newPage();

await page.goto(fullUrl, {
    waitUntil: 'networkidle2'
});

await page.type('#username', 'scott');
await page.type('#password', 'tiger');

await page.click('#Login_Button');
await page.waitFor(2000);

await page.pdf({
    path: outputFileName,
    displayHeaderFooter: true,
    headerTemplate: '',
    footerTemplate: '',
    printBackground: true,
    format: 'A4'
});

What I want is to generate the PDF report as soon as the page has loaded completely.

I don't want to write any kind of fixed delay, i.e. await page.waitFor(2000);

I cannot use waitForSelector because the page has charts and graphs which are rendered after calculations.

Any help will be appreciated.



Solution 1:[1]

You can use page.waitForNavigation() to wait for the new page to load completely before generating a PDF:

await page.goto(fullUrl, {
  waitUntil: 'networkidle0',
});

await page.type('#username', 'scott');
await page.type('#password', 'tiger');

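// Start waiting for the navigation before clicking, so that a fast navigation is not missed.
await Promise.all([
  page.waitForNavigation({
    waitUntil: 'networkidle0',
  }),
  page.click('#Login_Button'),
]);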

await page.pdf({
  path: outputFileName,
  displayHeaderFooter: true,
  headerTemplate: '',
  footerTemplate: '',
  printBackground: true,
  format: 'A4',
});

If there is a certain element that is generated dynamically that you would like included in your PDF, consider using page.waitForSelector() to ensure that the content is visible:

await page.waitForSelector('#example', {
  visible: true,
});
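
When several dynamically generated elements must be present, the same call can be issued for each of them in parallel. The following is only a minimal sketch: the helper name waitForAllVisible and the selectors in the usage comment are hypothetical, and only page.waitForSelector() with its visible and timeout options comes from Puppeteer.

```javascript
// Hypothetical helper: resolves once every selector matches a visible element.
// Rejects if any selector is still missing when the timeout (in ms) expires.
async function waitForAllVisible(page, selectors, timeout = 30000) {
  await Promise.all(
    selectors.map((selector) =>
      page.waitForSelector(selector, { visible: true, timeout })
    )
  );
}

// Usage (the selectors are placeholders for elements your charts render):
// await waitForAllVisible(page, ['#sales-chart canvas', '#summary-table']);
```

This still relies on knowing at least one selector per chart, so it fits best when each chart adds a recognisable element once it has finished drawing.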

Solution 2:[2]

The networkidle events do not always indicate that the page has completely loaded. A few scripts may still be modifying the content on the page. Watching for the HTML source to stop changing seems to yield better results. Here's a function you could use:

const waitTillHTMLRendered = async (page, timeout = 30000) => {
  const checkDurationMsecs = 1000; // polling interval
  const maxChecks = timeout / checkDurationMsecs;
  const minStableSizeIterations = 3; // consecutive unchanged checks required
  let lastHTMLSize = 0;
  let checkCounts = 1;
  let countStableSizeIterations = 0;

  while (checkCounts++ <= maxChecks) {
    const html = await page.content();
    const currentHTMLSize = html.length;

    // Logged for debugging only; the stability check uses the full document size.
    const bodyHTMLSize = await page.evaluate(() => document.body.innerHTML.length);

    console.log('last: ', lastHTMLSize, ' <> curr: ', currentHTMLSize, ' body html size: ', bodyHTMLSize);

    if (lastHTMLSize !== 0 && currentHTMLSize === lastHTMLSize) {
      countStableSizeIterations++;
    } else {
      countStableSizeIterations = 0; // reset the counter
    }

    if (countStableSizeIterations >= minStableSizeIterations) {
      console.log('Page rendered fully..');
      break;
    }

    lastHTMLSize = currentHTMLSize;
    // page.waitForTimeout() is deprecated and absent from recent Puppeteer releases; a plain timer works everywhere.
    await new Promise((resolve) => setTimeout(resolve, checkDurationMsecs));
  }
};

Call it after the page load or click, and before you process the page content, e.g.:

await page.goto(url, {'timeout': 10000, 'waitUntil':'load'});
await waitTillHTMLRendered(page)
const data = await page.content()
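
Two pages of equal length can still differ, so a stricter variant compares the sampled value itself rather than its size. The sketch below generalises the idea under that assumption: waitUntilStable and its option names are hypothetical, and the sample callback can be anything that returns a comparable value, such as page.content().

```javascript
// Hypothetical generalisation: poll any async `sample` function until it
// returns the same value `stableChecks` times in a row.
// Returns true when the value settled, false when the timeout was reached.
async function waitUntilStable(sample, { interval = 1000, stableChecks = 3, timeout = 30000 } = {}) {
  const deadline = Date.now() + timeout;
  let previous = await sample();
  let stableCount = 0;

  while (Date.now() < deadline) {
    await new Promise((resolve) => setTimeout(resolve, interval));
    const current = await sample();
    stableCount = current === previous ? stableCount + 1 : 0;
    if (stableCount >= stableChecks) {
      return true;
    }
    previous = current;
  }
  return false;
}

// Usage with Puppeteer: compare the full HTML rather than only its length.
// const settled = await waitUntilStable(() => page.content());
```

Returning a boolean instead of breaking silently lets the caller decide whether to generate the PDF anyway or to report that the page never settled.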

Solution 3:[3]

In some cases, the best solution for me was:

await page.goto(url, { waitUntil: 'domcontentloaded' });

Some other options you could try are:

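await page.goto(url, { waitUntil: 'load' });             // the load event has fired
await page.goto(url, { waitUntil: 'domcontentloaded' }); // the DOMContentLoaded event has fired
await page.goto(url, { waitUntil: 'networkidle0' });     // no network connections for at least 500 ms
await page.goto(url, { waitUntil: 'networkidle2' });     // no more than 2 network connections for at least 500 ms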

You can check these options in the Puppeteer documentation: https://pptr.dev/#?product=Puppeteer&version=v11.0.0&show=api-pagewaitfornavigationoptions
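
waitUntil also accepts an array of these events, in which case the navigation is considered finished only after all of them have fired. A minimal sketch of that form; the helper name gotoAndSettle is hypothetical and the default timeout is an arbitrary choice.

```javascript
// Hypothetical helper: navigate and wait for both the load event and
// network idle before resolving with the navigation response.
async function gotoAndSettle(page, url, timeout = 30000) {
  return page.goto(url, {
    waitUntil: ['load', 'networkidle0'],
    timeout,
  });
}

// Usage:
// await gotoAndSettle(page, url);
```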

Solution 4:[4]

I always like to wait for selectors, as many of them are a great indicator that the page has fully loaded:

await page.waitForSelector('#blue-button');

Solution 5:[5]

In the latest Puppeteer version, networkidle2 worked for me:

await page.goto(url, { waitUntil: 'networkidle2' });

Solution 6:[6]

Wrap the page.click and page.waitForNavigation calls in a Promise.all, so that the wait is already registered when the click triggers the navigation:

  await Promise.all([
    page.click('#submit_button'),
    page.waitForNavigation({ waitUntil: 'networkidle0' })
  ]);
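
The reason for Promise.all is ordering: if the click is awaited first, a fast navigation can finish before waitForNavigation starts listening, and the wait then times out. A small sketch of a reusable wrapper follows; the name clickAndWaitForNavigation is hypothetical.

```javascript
// Hypothetical wrapper: registers the navigation wait before the click is
// dispatched, then resolves with the navigation response.
async function clickAndWaitForNavigation(page, selector, waitUntil = 'networkidle0') {
  const [response] = await Promise.all([
    page.waitForNavigation({ waitUntil }),
    page.click(selector),
  ]);
  return response;
}

// Usage:
// await clickAndWaitForNavigation(page, '#submit_button');
```

For navigations performed through the History API, which single-page applications commonly use, Puppeteer resolves the wait with null rather than a response object, so the return value should not be relied on in that case.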

Solution 7:[7]

I encountered the same issue with networkidle when I was working on an offscreen renderer. I needed a WebGL-based engine to finish rendering and only then take a screenshot. What worked for me was the page.waitForFunction() method. In my case the usage was as follows:

await page.goto(url);
await page.waitForFunction("renderingCompleted === true");
const imageBuffer = await page.screenshot({});

In the rendering code, I simply set the renderingCompleted variable to true when done. If you don't have access to the page code, you can use some other existing identifier.
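
page.waitForFunction() also accepts a function and an options object, which allows a timeout and a polling mode to be set. This is a minimal sketch, assuming the page sets a global renderingCompleted flag as described above; the helper name waitForRenderFlag is hypothetical.

```javascript
// Hypothetical helper: waits until the page sets window.renderingCompleted.
// The predicate runs inside the browser, so it may only use page globals.
async function waitForRenderFlag(page, timeout = 30000) {
  await page.waitForFunction(
    () => window.renderingCompleted === true,
    { timeout, polling: 'raf' } // re-check on every animation frame
  );
}

// In the page's own rendering code, once the last chart has been drawn:
// window.renderingCompleted = true;
```

For the PDF case in the question, the same flag could be set after the last chart finishes its calculations, and page.pdf() called once the helper resolves.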

Solution 8:[8]

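You can also use the following to wait until elements have been rendered:

// page.waitFor() is deprecated; for a selector string, page.waitForSelector() is the equivalent call.
await page.waitForSelector('*');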

Reference: https://github.com/puppeteer/puppeteer/issues/1875

Solution 9:[9]

As of December 2020, the waitFor function is deprecated, as the warning in the code says:

waitFor is deprecated and will be removed in a future release. See https://github.com/puppeteer/puppeteer/issues/6214 for details and how to migrate your code.

You can use:

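function sleep(millisecondsCount) {
    // Nothing to wait for: resolve immediately so callers can always await the result.
    if (!millisecondsCount) {
        return Promise.resolve();
    }
    return new Promise(resolve => setTimeout(resolve, millisecondsCount));
}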

And use it:

(async () => {
    await sleep(1000);
})();

Solution 10:[10]

I can't leave comments, but I made a Python version of Anand's answer for anyone who finds it useful (i.e. if they use pyppeteer).

import asyncio

from pyppeteer.page import Page


async def waitTillHTMLRendered(page: Page, timeout: int = 30000):
    check_duration_m_secs = 1000  # polling interval in milliseconds
    max_checks = timeout / check_duration_m_secs
    last_html_size = 0
    check_counts = 1
    count_stable_size_iterations = 0
    min_stable_size_iterations = 3

    while check_counts <= max_checks:
        check_counts += 1
        html = await page.content()
        current_html_size = len(html)

        if last_html_size != 0 and current_html_size == last_html_size:
            count_stable_size_iterations += 1
        else:
            count_stable_size_iterations = 0  # reset the counter

        # Stop once the HTML size has been unchanged for enough consecutive checks.
        if count_stable_size_iterations >= min_stable_size_iterations:
            break

        last_html_size = current_html_size
        await asyncio.sleep(check_duration_m_secs / 1000)

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Arel
Solution 3
Solution 4 Nicolás A.
Solution 5 attacomsian
Solution 6
Solution 7 Dharman
Solution 8 Phat Tran
Solution 9 Or Assayag
Solution 10 Erik Tillberg