Using worker/background processes in node.js vs async call


Question

I want to know whether there is any benefit in handing off db or other async calls to one or more worker processes. Specifically, I'm using Heroku and Postgres. I've read up a good bit on node.js and on how to structure a server so that the event loop isn't blocked and incoming requests aren't left hanging for longer than 300ms or so.

Say I have the following:

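app.get('/getsomeresults/:query', function (request, response) {
    var foo = request.params.query;
    pg.connect(process.env.DATABASE_URL, function (err, client, done) {
        if (err) { return response.status(500).json({error: 'connection failed'}); }
        client.query("SELECT * FROM users WHERE cat=$1", [foo], function (err, result) {
            done(); // release the client back to the pool
            if (err) { return response.status(500).json({error: 'query failed'}); }
            // do some stuff with result.rows that may take 1000ms
            response.json({some: data});
        });
    });
});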

Given that the PostgreSQL calls are async by nature, is there any real benefit to creating a worker process to handle the processing of the result set from the initial db call?

Recommended answer

Calling client.query from a separate process won't give you a real benefit here, as sending queries to the server is already an asynchronous operation in node-pg. The real problem is the long execution time of your callback function. The callback runs synchronously in the main event loop and blocks all other operations while it runs, so it would be a good idea to make this part non-blocking.
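To see why the callback is the bottleneck, here is a minimal sketch. The `busyWork` helper and the 200 ms duration are made up for illustration and stand in for the slow row processing. A timer that is due immediately still cannot fire until the synchronous work has finished:

```javascript
// Hypothetical stand-in for "do some stuff with result.rows that may take 1000ms".
function busyWork(ms) {
    var end = Date.now() + ms;
    while (Date.now() < end) { /* spin: this blocks the event loop */ }
}

var order = [];

// Stands in for another request that is ready to be handled right away.
setTimeout(function () {
    order.push("timer fired");
    console.log(order.join(" -> "));
    // prints "callback start -> callback end -> timer fired"
}, 0);

order.push("callback start");
busyWork(200); // nothing else can run during these 200 ms
order.push("callback end");
```

The timer callback always comes last, no matter how short its delay is, because the event loop only gets control back once the synchronous code returns.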

Option 1: Fork a child process

Creating a new process every time the callback is executed is not a good idea, since each Node.js process needs its own environment, which is time-consuming to set up. Instead, it would be better to create multiple server processes when the server is started and let them handle requests concurrently.

Option 2: Use Node.js cluster

Luckily, Node.js offers the cluster module to achieve exactly this. A cluster lets you manage multiple worker processes from one master process. The workers can share a listening port, so you can simply create an HTTP server in each child process and the incoming requests will be distributed among them automatically (node-pg, for its part, supports connection pooling on the database side).

The cluster solution is also nice because you don't have to change much in your code. Just write the master process code and start your existing code as workers.

The official documentation on Node.js clusters explains all aspects of clusters very well, so I won't go into details here. Just a short example of what the master code could look like:

var cluster = require("cluster");
var os = require("os");
var http = require("http");

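// Note: Node.js 16+ calls this property cluster.isPrimary;
// cluster.isMaster remains available as a deprecated alias.
if (cluster.isMaster)
    master();
else
    worker();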

function master() {
    console.info("MASTER "+process.pid+" starting workers");
    //Create a worker for each CPU core
    var numWorkers = os.cpus().length;
    for (var i = 0; i < numWorkers; i++)
        cluster.fork();
}

function worker() {
    //Put your existing code here
    console.info("WORKER "+process.pid+" starting http server");
    var httpd = http.createServer();
    //...
}

Option 3: Split up the result processing

I assume that the callback function takes so long because you have to process a lot of result rows and there is no way to process them faster.

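In that case it might also be a good idea to split the processing into several chunks and schedule each chunk with setImmediate(). Each chunk still runs synchronously, but the chunks are spread over several event-loop iterations, so other operations (like event handlers) can be executed between them. (process.nextTick() is not suitable for this in current Node.js versions: its callbacks run before the event loop continues, so pending I/O would still be starved.) Here's a rough (and untested) sketch of how the code could look:

// Query callback: processes result.rows in chunks of 100 rows,
// yielding to the event loop between chunks.
function onResult(err, result) {
    var CHUNK_SIZE = 100;
    var start = 0;
    processChunk();

    // process up to CHUNK_SIZE rows in one event-loop iteration
    function processChunk() {
        var end = Math.min(start + CHUNK_SIZE, result.rows.length);
        for (var i = start; i < end; i++) {
            // do some stuff with result.rows[i]
        }
        start = end;
        if (start < result.rows.length) {
            // setImmediate lets pending I/O callbacks run before the next chunk
            setImmediate(processChunk);
        } else {
            // go on (send the response)
        }
    }
}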
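For something you can run directly, here is a self-contained variant of the chunking idea. It is only a sketch: `processInChunks`, `handleRow` and `chunkSize` are names made up for this example, and it schedules the next chunk with setImmediate so that pending I/O callbacks get a turn between chunks.

```javascript
// Process `rows` in chunks, yielding to the event loop between chunks.
// `handleRow` and `chunkSize` are illustrative names, not part of any library.
function processInChunks(rows, chunkSize, handleRow) {
    return new Promise(function (resolve, reject) {
        var start = 0;

        function processChunk() {
            try {
                var end = Math.min(start + chunkSize, rows.length);
                for (var i = start; i < end; i++) {
                    handleRow(rows[i], i);
                }
                start = end;
            } catch (err) {
                return reject(err);
            }
            if (start < rows.length) {
                // queue the next chunk behind pending I/O callbacks
                setImmediate(processChunk);
            } else {
                resolve();
            }
        }

        processChunk();
    });
}

// Usage: sum the numbers 1..1000 in chunks of 100.
var rows = [];
for (var n = 1; n <= 1000; n++) rows.push(n);

var sum = 0;
var done = processInChunks(rows, 100, function (row) { sum += row; })
    .then(function () {
        console.log("sum:", sum); // prints "sum: 500500"
        return sum;
    });
```

In the route handler from the question, `response.json(...)` would go into the `.then()` callback. The chunk size is a trade-off: smaller chunks keep the server more responsive but add scheduling overhead.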

I'm not 100% sure, but I think node-pg offers some way to receive a query result in several chunks rather than as a whole (the pg-cursor and pg-query-stream packages look like candidates). This would simplify the code a lot, so it might be worth searching in that direction...
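As a sketch of what chunked reading could look like, the helper below pulls batches from a cursor-like object until it is empty. It assumes the cursor has a `read(rowCount, callback)` method that calls back with `(err, rows)` and with an empty array once no rows are left, which is how pg-cursor describes its API. `readInBatches` and `FakeCursor` are hypothetical names, and the fake cursor is only there so the example runs without a database.

```javascript
// Reads batches from a cursor-like object until it is exhausted.
// Assumption: cursor.read(rowCount, callback) calls back with (err, rows),
// and with an empty array once no rows are left.
function readInBatches(cursor, batchSize, onBatch, onDone) {
    cursor.read(batchSize, function (err, rows) {
        if (err) return onDone(err);
        if (rows.length === 0) return onDone(null);
        onBatch(rows);
        // fetch the next batch; the event loop is free while waiting for it
        readInBatches(cursor, batchSize, onBatch, onDone);
    });
}

// Hypothetical in-memory cursor, used only so the sketch runs without a database.
function FakeCursor(rows) {
    this.rows = rows.slice();
}
FakeCursor.prototype.read = function (rowCount, callback) {
    var batch = this.rows.splice(0, rowCount);
    setImmediate(function () { callback(null, batch); });
};

var batchSizes = [];
var finished = new Promise(function (resolve, reject) {
    readInBatches(new FakeCursor([1, 2, 3, 4, 5]), 2,
        function (rows) { batchSizes.push(rows.length); },
        function (err) { return err ? reject(err) : resolve(batchSizes); });
});
finished.then(function (sizes) {
    console.log(sizes.join(", ")); // prints "2, 2, 1"
});
```

Because each batch arrives in its own callback, the processing is split up naturally and no manual scheduling is needed, as long as a single batch is cheap to process.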

Final conclusion

I would start with option 2, and add option 3 if new requests still have to wait too long.
