NodeJs / Express: avoid sequential processing of many files

Problem Description

I have an Express webhook that is rarely called:

app.use('/convert', async (req, res) => {
  const files = await getFiles();
  for (let file of files) {
    await download(file);
    await convert(file);
    await upload(file);
  }
  res.send('finished');
});

Each iteration of the loop takes a few minutes, and there may be hundreds of files to process. How can I avoid sequential processing here?

Many thanks

Recommended Answer

The easiest thing to do would be to process everything concurrently. The Promise specification has some methods for working with multiple promises simultaneously; for this we'll want to use Promise.all.

app.use('/convert', async (req, res) => {
    const files = await getFiles();
    // Start download/convert/upload for every file at once, then wait for all of them
    const promises = files.map(async (file) => {
        await download(file);
        await convert(file);
        await upload(file);
    });
    await Promise.all(promises);
    res.send('finished');
});
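
One caveat worth noting as an aside (it is not part of the original answer): Promise.all rejects as soon as any single promise rejects, so one failing file makes the request fail even though the remaining conversions keep running unobserved. If you would rather let every file finish and report failures afterwards, Promise.allSettled (available since Node 12.9) is an option. A minimal sketch, assuming the same getFiles/download/convert/upload helpers as above:

app.use('/convert', async (req, res) => {
    const files = await getFiles();
    // allSettled resolves after every file has either succeeded or failed,
    // so one bad file does not abort the whole request
    const results = await Promise.allSettled(files.map(async (file) => {
        await download(file);
        await convert(file);
        await upload(file);
    }));
    const failed = results.filter((r) => r.status === 'rejected').length;
    res.send(`finished (${failed} failed)`);
});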

Although doing everything at once is relatively simple, it can be very resource-heavy. It's unclear how download, convert, and upload work internally, but it's quite possible that you would hit the limits of the machine's resources. To avoid things like hitting the open-file limit or running out of memory, there should be a limit on the number of items being processed concurrently.

One way is to process items in batches. To process in batches, you can simply split the array of files into chunks and combine the solution above with your iterative solution.

app.use('/convert', async (req, res) => {
    const files = await getFiles();

    // Split the files into chunks of chunkSize
    const chunkSize = 5;
    const chunks = [];
    while (files.length) {
        chunks.push(files.splice(0, chunkSize));
    }

    // Process one chunk at a time; the files within a chunk run concurrently
    for (const chunk of chunks) {
        const promises = chunk.map(async (file) => {
            await download(file);
            await convert(file);
            await upload(file);
        });
        await Promise.all(promises);
    }
    res.send('finished');
});
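
A small side note: Array.prototype.splice in the loop above empties the original files array while building the chunks. If the handler needs files again afterwards, a non-mutating variant with slice produces the same chunks. A rough sketch (toChunks is a hypothetical helper name, not part of the original answer):

// Non-mutating chunking: slice copies each chunk and leaves `files` untouched
function toChunks(files, chunkSize) {
    const chunks = [];
    for (let i = 0; i < files.length; i += chunkSize) {
        chunks.push(files.slice(i, i + chunkSize));
    }
    return chunks;
}

// Usage: const chunks = toChunks(files, 5);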

The implementation above will wait for chunkSize items to finish processing before queuing up another chunkSize items. Because it waits for every item in the chunk to finish, some items may process very quickly while others take much longer, and in that case you end up under-utilizing your resources. Ideally you would always be processing chunkSize items at a time. To do this you can queue up chunkSize "threads" for processing; each "thread" processes one item at a time until there is nothing left to process.

async function process(file) {
    await download(file);
    await convert(file);
    await upload(file);
}

// Each "thread" keeps pulling files off the shared array until it is empty
async function thread(files) {
    while (files.length) {
        await process(files.pop());
    }
}

app.use('/convert', async (req, res) => {
    const files = await getFiles();

    const maxConcurrency = 5;

    // Start maxConcurrency "threads"; at most that many files are in flight at once
    const threads = [];
    for (let i = 0; i < maxConcurrency; i++) {
        threads.push(thread(files));
    }
    await Promise.all(threads);

    res.send('finished');
});
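
If this pattern is needed in more than one route, the same worker-pool idea can be pulled into a small reusable helper. The sketch below is only an illustration (mapConcurrent is a hypothetical name, not an existing library function); it runs a worker over an array with at most limit calls in flight at a time and returns the results in the original order:

// Run `worker` over `items` with at most `limit` calls in flight at a time.
// Results come back in the same order as `items`.
async function mapConcurrent(items, limit, worker) {
    const results = new Array(items.length);
    let next = 0;

    async function run() {
        while (next < items.length) {
            const i = next++; // claim the next index
            results[i] = await worker(items[i]);
        }
    }

    const workers = Array.from({ length: Math.min(limit, items.length) }, run);
    await Promise.all(workers);
    return results;
}

// Usage with the process() function above:
// await mapConcurrent(files, 5, process);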
