Memory leak when calling too many promises in Nodejs/Request/MongoDB


Question

When I try to make up to 200,000 POST requests in NodeJS, it fails with errors like a heap out-of-memory error.

In each POST request, I want to insert the resolved data into a localhost MongoDB.

Making 2,000 requests at a time is fine, but dealing with 200,000 requests is really difficult. I'm stuck on this problem and don't know exactly how to resolve it.

I really need your help or any suggestions. Thanks in advance.

    const mongoose = require('mongoose');
    const request = require('request');

    // DB connection
    mongoose
        .connect("mongodb://localhost:27017/test?retryWrites=true&w=majority", { useNewUrlParser: true, useUnifiedTopology: true })
        .then(() => console.log('Connected!'))
        .catch(err => console.error('Could not connect...', err));

    // Initialize Mongoose model
    const Sample = mongoose.model(
        'Sample',
        new mongoose.Schema({}, { strict: false, versionKey: false }),
        'sample_collection'
    );

    // Insert data into Sample
    var insertDataIntoSample = function (means) {
        Sample.collection.insert(means, { ordered: false });
    }

    // HTTP POST request to get data
    // HTTP POST request to get data
    const getDataFromInternet = function (param) {
        return new Promise((resolve, reject) => {
            request.post(
                'https://url-to-post-data.com/',
                { json: { 'query': param } },
                function (error, response, body) {
                    if (!error && response.statusCode == 200 && body) {
                        insertDataIntoSample(body.data);
                        resolve(param);
                    } else {
                        // settle the promise on failure too, so it can't stay pending forever
                        reject(error || new Error('request failed'));
                    }
                }
            );
        });
    };

    // Call up to 200,000 requests
    var myParams = [...] // 200,000 elements
    for (var i = 0; i < myParams.length; i++) {
        getDataFromInternet(myParams[i]).then(function (data) {
            console.log(data)
        }).catch(function (err) {
            console.error(err)
        })
    }

Answer

So, it's just downright counterproductive to submit 200,000 requests at a time to your database. There's no way your database can actually work on more than a few requests at a time anyway, so all you're doing by putting that many requests in flight at the same time is causing an enormous amount of peak memory usage.

With a little testing, you would figure out approximately how many simultaneous requests are still efficient; it depends entirely upon your database and its configuration. A big-iron database server might have access to lots of CPUs/threads and maybe even some efficient disk partitioning, and be able to make progress on a number of requests at a time. A smaller configuration might not gain anything after just a couple of requests in flight at a time.

There are several dozen options here on stackoverflow and elsewhere for making an asynchronous function call while processing an array and doing it so that only N requests are in flight at the same time. That's probably the general concept you want here. Libraries such as Bluebird and Async-Promises have functions built-in to manage concurrent access.

One of my simple favorites (just a function you can copy) is called mapConcurrent(). You pass it the array, the max number of requests you want in progress at a time and a promise-returning function that it will call for every item in the array.

You run experiments with your configuration to see what the optimal value for maxConcurrent is (hint, it's probably a fairly small number like under 10).

// takes an array of items and a function that returns a promise
function mapConcurrent(items, maxConcurrent, fn) {
    let index = 0;
    let inFlightCntr = 0;
    let doneCntr = 0;
    let results = new Array(items.length);
    let stop = false;

    return new Promise(function(resolve, reject) {

        function runNext() {
            let i = index;
            ++inFlightCntr;
            fn(items[index], index++).then(function(val) {
                ++doneCntr;
                --inFlightCntr;
                results[i] = val;
                run();
            }, function(err) {
                // set flag so we don't launch any more requests
                stop = true;
                reject(err);
            });
        }

        function run() {
            // launch as many as we're allowed to
            while (!stop && inFlightCntr < maxConcurrent && index < items.length) {
                runNext();
            }
            // if all are done, then resolve parent promise with results
            if (doneCntr === items.length) {
                resolve(results);
            }
        }

        run();
    });
}
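Here is a usage sketch. The request itself is simulated with a timer so the snippet runs standalone, and the parameter values are illustrative; `mapConcurrent()` is repeated from above so the snippet is self-contained:

```javascript
// mapConcurrent() repeated from above so this snippet is self-contained
function mapConcurrent(items, maxConcurrent, fn) {
    let index = 0;
    let inFlightCntr = 0;
    let doneCntr = 0;
    let results = new Array(items.length);
    let stop = false;

    return new Promise(function (resolve, reject) {
        function runNext() {
            let i = index;
            ++inFlightCntr;
            fn(items[index], index++).then(function (val) {
                ++doneCntr;
                --inFlightCntr;
                results[i] = val;
                run();
            }, function (err) {
                stop = true;
                reject(err);
            });
        }

        function run() {
            // launch as many as we're allowed to
            while (!stop && inFlightCntr < maxConcurrent && index < items.length) {
                runNext();
            }
            // if all are done, resolve parent promise with results
            if (doneCntr === items.length) {
                resolve(results);
            }
        }

        run();
    });
}

// Simulated stand-in for getDataFromInternet(): resolves after a short delay
function fakeGetDataFromInternet(param) {
    return new Promise(resolve => setTimeout(() => resolve(param), 10));
}

// At most 3 "requests" in flight at any moment
function runAll() {
    return mapConcurrent(['a', 'b', 'c', 'd', 'e', 'f'], 3, fakeGetDataFromInternet);
}

runAll().then(results => console.log(results)); // → [ 'a', 'b', 'c', 'd', 'e', 'f' ]
```

In the real code, `fn` would be `getDataFromInternet` and `items` the 200,000-element `myParams` array; the resolved array preserves input order regardless of which requests finish first.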

There are some other options mentioned in this answer: Batching asynchronous operations.
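A simpler batching variant along those lines (a sketch with a simulated request; the chunk size and fake delay are illustrative assumptions) waits for each chunk of N requests with `Promise.all` before launching the next chunk:

```javascript
// Simulated stand-in for one POST request
function fakeRequest(param) {
    return new Promise(resolve => setTimeout(() => resolve(param), 10));
}

// Process items in sequential chunks so at most chunkSize
// requests are in flight at once
async function processInChunks(items, chunkSize, fn) {
    const results = [];
    for (let i = 0; i < items.length; i += chunkSize) {
        const chunk = items.slice(i, i + chunkSize);
        // wait for the whole chunk before launching the next one
        results.push(...await Promise.all(chunk.map(fn)));
    }
    return results;
}

processInChunks(['a', 'b', 'c', 'd', 'e'], 2, fakeRequest)
    .then(results => console.log(results)); // → [ 'a', 'b', 'c', 'd', 'e' ]
```

Unlike `mapConcurrent()`, this stalls on the slowest request in each chunk before starting the next, but it is very easy to reason about.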
