node.js async/await or generic-pool causes infinite loop?


Question

I was trying to create an automation script for work. It is supposed to use multiple puppeteer instances to process input strings simultaneously. The task queue and the number of puppeteer instances are controlled by the package generic-pool. Strangely, when I run the script on Ubuntu or Debian, it seems to fall into an infinite loop and tries to launch an unlimited number of puppeteer instances, whereas on Windows the output is normal.

const puppeteer = require('puppeteer');
const genericPool = require('generic-pool');
const faker = require('faker');
let options = require('./options');
let i = 0;
let proxies = [...options.proxy];

const pool = genericPool.createPool({
    create: async () => {
        i++;
        console.log(`create instance ${i}`);
        if (!proxies.length) {
            proxies = [...options.proxy];
        }
        let {control = null, proxy} = proxies.pop();
        let instance = await puppeteer.launch({
            headless: true,
            args: [
                `--proxy-server=${proxy}`,
            ]
        });
        instance._own = {
            proxy,
            tor: control,
            numInstance: i,
        };
        return instance;
    },
    destroy: async instance => {
        console.log('destroy instance', instance._own.numInstance);
        await instance.close()
    },
}, {
    max: 3, 
    min: 1, 
});

async function run(emails = []) {
    console.log('Processing', emails.length);
    const promises = emails.map(email => {
        console.log('Processing', email)
        pool.acquire()
            .then(browser => {
                console.log(`${email} handled`)
                pool.destroy(browser);})
    })
    await Promise.all(promises)
    await pool.drain();
    await pool.clear();
}

let emails = [a,b,c,d,e,];
run(emails)

Output

create instance 1
Processing 10
Processing Stacey_Haley52
Processing Polly.Block
create instance 2
Processing Shanny_Hudson59
Processing Vivianne36
Processing Jayda_Ullrich
Processing Cheyenne_Quitzon
Processing Katheryn20
Processing Jamarcus74
Processing Lenore.Osinski
Processing Hobart75
create instance 3
create instance 4
create instance 5
create instance 6
create instance 7
create instance 8
create instance 9

Is it because of my async functions? How can I fix it? I appreciate your help!

Edit 1. Modified according to @James's suggestion.

Recommended answer

The main problem you are trying to solve:

It is supposed to use multiple puppeteer instances to process input strings simultaneously.

Promise Queue

You can use a fairly simple solution built on a promise queue. The p-queue package lets us limit the concurrency as we wish. I have used it on multiple scraping projects to test things out.

Here is how you can use it.

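const puppeteer = require('puppeteer');
// Depending on the installed p-queue version, this may need to be
// require('p-queue').default, or an ESM import in ESM-only releases.
const PQueue = require('p-queue');
// same options file as in the question
const options = require('./options');

// emails to handle (placeholders from the question; substitute real strings)
let emails = [a, b, c, d, e, ];

// instance counter and proxy list, as in the question's factory
let i = 0;
let proxies = [...options.proxy];

// create queue with concurrency, ie: how many instances we want to run at once
const queue = new PQueue({
    concurrency: 1
});

// single task processor
const createInstance = async (email) => {
    i++;
    // refill the proxy list once it has been used up
    if (!proxies.length) {
        proxies = [...options.proxy];
    }
    let {control = null, proxy} = proxies.pop();
    let instance = await puppeteer.launch({
        headless: true,
        args: [
            `--proxy-server=${proxy}`,
        ]
    });
    instance._own = {
        proxy,
        tor: control,
        numInstance: i,
    };
    console.log('email:', email)
    // the caller should call instance.close() once the email has been handled
    return instance;
}

// add tasks to queue
for (let email of emails) {
    queue.add(async () => createInstance(email))
}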
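
To make the idea behind the queue concrete, here is a minimal, dependency-free sketch of a concurrency limiter. It is not how p-queue is implemented; `runWithConcurrency`, `lane`, and the sample addresses are names made up for this illustration, and the worker is a stand-in for "launch a browser and handle one email".

```javascript
// Run `worker` over `items`, with at most `limit` calls in flight at once.
// Dependency-free sketch of the idea behind a promise queue; not p-queue's code.
async function runWithConcurrency(items, limit, worker) {
    const results = new Array(items.length);
    let next = 0; // index of the next item to hand out

    // Each "lane" keeps pulling the next unprocessed item until none are left.
    async function lane() {
        while (next < items.length) {
            const index = next++;
            results[index] = await worker(items[index], index);
        }
    }

    const lanes = [];
    for (let n = 0; n < Math.min(limit, items.length); n++) {
        lanes.push(lane());
    }
    await Promise.all(lanes);
    return results;
}

// Usage with placeholder data; the worker only simulates some async work.
const sampleEmails = ['a@example.com', 'b@example.com', 'c@example.com'];
runWithConcurrency(sampleEmails, 2, async (email) => {
    await new Promise((resolve) => setTimeout(resolve, 10));
    return `${email} handled`;
}).then((results) => console.log(results));
```

Note that in this sketch a worker that throws makes the returned promise reject, so a failing launch surfaces to the caller instead of being retried silently.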

Generic Pool Infinite Loop Problem

I removed all puppeteer-related code from your sample and saw that it still printed endless output to the console.

create instance 70326
create instance 70327
create instance 70328
create instance 70329
create instance 70330
create instance 70331
...

Now, if you test it a few times, you will see that it enters the loop only when something in your code is crashing. The culprit is the pool.acquire() promise, which just re-queues on error instead of rejecting.
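
The behaviour described above can be modelled in a few lines. This is a simplified illustration, not generic-pool's source: `acquireWithRetry` and `failingCreate` are hypothetical names, and `maxAttempts` exists only so that the demo terminates. Without that cap, a factory that always fails would be called forever while the caller keeps waiting.

```javascript
// Simplified model of a pool that retries creation whenever the factory fails.
// NOT generic-pool's code; maxAttempts is here only so the demo can end.
async function acquireWithRetry(create, maxAttempts) {
    let attempts = 0;
    while (attempts < maxAttempts) {
        attempts++;
        try {
            const resource = await create();
            return { resource, attempts };
        } catch (err) {
            // The error is swallowed and creation is simply retried,
            // so the caller waiting on the acquire never sees it.
        }
    }
    return { resource: null, attempts };
}

// Hypothetical factory that always fails, standing in for a crashing launch.
const failingCreate = async () => {
    throw new Error('launch failed');
};

acquireWithRetry(failingCreate, 5).then(({ resource, attempts }) => {
    console.log(`attempts: ${attempts}, resource: ${resource}`);
    // prints "attempts: 5, resource: null"
});
```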

To find out what is causing the crash, listen for the following events:

pool.on("factoryCreateError", function(err) {
  console.log('factoryCreateError',err);
});

pool.on("factoryDestroyError", function(err) {
  console.log('factoryDestroyError',err);
});
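
As a complement to the event listeners, the factory's create function can be wrapped so that a failure is logged right where it happens. The sketch below has no dependencies; `withCreateLogging`, `log`, and `flakyCreate` are hypothetical names for this illustration, and the wrapper rethrows the error so whatever calls it still sees the rejection.

```javascript
// Wrap a factory's create function so failures are logged where they happen.
// `withCreateLogging` and `log` are hypothetical names used for this sketch.
function withCreateLogging(create, log = console.error) {
    let failures = 0;
    return async (...args) => {
        try {
            return await create(...args);
        } catch (err) {
            failures++;
            log(`create failed (${failures} so far):`, err.message);
            throw err; // rethrow so the caller still sees the rejection
        }
    };
}

// Demo with a stand-in factory that fails once and then succeeds.
let calls = 0;
const flakyCreate = async () => {
    calls++;
    if (calls === 1) throw new Error('launch failed');
    return { id: calls };
};

const loggedCreate = withCreateLogging(flakyCreate);
loggedCreate()
    .catch(() => loggedCreate()) // retry once after the logged failure
    .then((resource) => console.log('created', resource));
```

In the question's setup, the function passed as `create` to genericPool.createPool could be wrapped this way, so the underlying launch error shows up in the console even before any listener is attached.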

Some related issues:

  • acquire() never resolves/rejects if factory always rejects
  • About the acquire function in pool.js
  • .acquire() doesn't reject when resource creation fails

Good luck!
