为什么木偶操纵者报告UnhandledPromiseRejectionWarning:Error:导航失败,因为浏览器已断开连接!"? [英] Why is puppeteer reporting "UnhandledPromiseRejectionWarning: Error: Navigation failed because browser has disconnected!"?

查看:24
本文介绍了为什么木偶操纵者报告UnhandledPromiseRejectionWarning:Error:导航失败,因为浏览器已断开连接!"?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个简单的node.js脚本来捕获几个网页的屏幕截图。似乎我在使用异步/等待的过程中遇到了问题,但是我不知道在哪里。我当前使用的是Pupeteer v1.11.0。

const puppeteer = require('puppeteer');

//a list of sites to screenshot
const papers =
{
     nytimes: "https://www.nytimes.com/",
     wapo: "https://www.washingtonpost.com/"
};

//launch puppeteer, do everything in .then() handler
puppeteer.launch({devtools:false}).then(function(browser){

//create a load_page function that returns a promise which resolves when screenshot is taken
async function load_page(paper){

    const url = papers[paper];

    return new Promise(async function(resolve, reject){

        const page = await browser.newPage();
        await page.setViewport({width:1024, height: 768});

        //screenshot on first console message
        page.once("console", async console_msg => {        
            await page.pdf({path: paper + '.pdf',
                            printBackground:true,
                            width:'1024px',
                            height:'768px',
                            margin: {top:"0px", right:"0px", bottom:"0px", left:"0px"}
                        });

            //close page
            await page.close();

            //resolve promise
            resolve();
        });

        //go to page
        await page.goto(url, {"waitUntil":["load", "networkidle0"]});

    })     
}

//step through the list of papers, calling the above load_page()
async function stepThru(){
    for(var p in papers){
        if(papers.hasOwnProperty(p)){
            //wait to load page and screenshot before loading next page
            await load_page(p);
        }
    }

    //close browser after loop has finished (and all promises resolved)
    await browser.close();  
}

//kick it off
stepThru();

//getting this error message:
//UnhandledPromiseRejectionWarning: Error: Navigation failed because browser has disconnected!

});

推荐答案

Navigation failed because browser has disconnected错误通常表示启动Puppeteer的节点脚本在没有等待Puppeteer操作完成的情况下结束。因此,如您所述,某些等待是一个问题。

关于您的脚本,我做了一些更改以使其正常工作:

  1. 首先,您没有等待stepThru函数的(异步)结束,因此请更改
stepThru();

await stepThru();

puppeteer.launch({devtools:false}).then(function(browser){

puppeteer.launch({devtools:false}).then(async function(browser){

(我添加了async)

  1. 我更改了您管理gotopage.once承诺的方式

PDF承诺现在为:

new Promise(async function(resolve, reject){
    //screenshot on first console message
    page.once("console", async () => {
      await page.pdf({
        path: paper + '.pdf', 
        printBackground:true, 
        width:'1024px', 
        height:'768px', 
        margin: {
          top:"0px", 
          right:"0px", 
          bottom:"0px", 
          left:"0px"
        } 
      });
      resolve();
    });
})

它只有一个职责,就是创建PDF。

  1. 然后我用Promise.all
  2. 管理了page.goto和PDF承诺
await Promise.all([
    page.goto(url, {"waitUntil":["load", "networkidle2"]}),
    new Promise(async function(resolve, reject){
        // ... pdf creation as above        
    })
]);
  1. 我将page.close移到Promise.all
  2. 之后
await Promise.all([
    // page.goto
    // PDF creation
]);

await page.close();
resolve();

现在它可以工作了,这里是完整的工作脚本:

const puppeteer = require('puppeteer');

//a list of sites to screenshot
const papers =
{
  nytimes: "https://www.nytimes.com/",
  wapo: "https://www.washingtonpost.com/"
};

//launch puppeteer, do everything in .then() handler
puppeteer.launch({devtools:false}).then(async function(browser){

  //create a load_page function that returns a promise which resolves when screenshot is taken
  async function load_page(paper){
    const url = papers[paper];
    return new Promise(async function(resolve, reject){
      const page = await browser.newPage();
      await page.setViewport({width:1024, height: 768});

      await Promise.all([
        page.goto(url, {"waitUntil":["load", "networkidle2"]}),
        new Promise(async function(resolve, reject){

          //screenshot on first console message
          page.once("console", async () => {
            await page.pdf({path: paper + '.pdf', printBackground:true, width:'1024px', height:'768px', margin: {top:"0px", right:"0px", bottom:"0px", left:"0px"} });
            resolve();
          });
        })
      ]);

      await page.close();
      resolve();
    })
  }

  //step through the list of papers, calling the above load_page()
  async function stepThru(){
    for(var p in papers){
      if(papers.hasOwnProperty(p)){
        //wait to load page and screenshot before loading next page
        await load_page(p);
      }
    }

    await browser.close();
  }

  await stepThru();
});

请注意:

  • 我将networkidle0更改为networkidle2是因为nytimes.com网站需要很长时间才能达到0网络请求状态(由于AD等原因)。您显然可以等待networkidle0,但这由您决定,这超出了您的问题范围(在这种情况下增加page.goto超时)。

  • www.washingtonpost.com站点转到TOO_MANY_REDIRECTS错误,因此我更改为washingtonpost.com,但我认为您应该对其进行更多调查。为了测试该脚本,我在nytimes站点和其他网站上使用了更多次。再次声明:这超出了您的问题范围。

如果您需要更多帮助,请告诉我😉

这篇关于为什么木偶操纵者报告UnhandledPromiseRejectionWarning:Error:导航失败,因为浏览器已断开连接!"?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆