Node js - http.request() problems with connection pooling

Problem Description

Consider the following simple Node.js application:

var http = require('http');
http.createServer(function() { }).listen(8124); // Prevent process shutting down

var requestNo = 1;
var maxRequests = 2000;

function requestTest() {
    http.request({ host: 'www.google.com', method: 'GET' }, function(res) {
        console.log('Completed ' + (requestNo++));

        if (requestNo <= maxRequests) {
            requestTest();
        }
    }).end();
}

requestTest();

It makes 2000 HTTP requests to google.com, one after the other. The problem is that it gets to request No. 5 and pauses for about 3 minutes, then continues with requests 6 - 10, then pauses for another 3 minutes, then requests 11 - 15, pauses, and so on. I tried changing www.google.com to localhost, pointing at an extremely basic Node.js app running on my machine that returns "Hello world", and I still get the 3-minute pauses.

Now I read I can increase the connection pool limit:

http.globalAgent.maxSockets = 20;

Now if I run it, it processes requests 1 - 20, then pauses for 3 mins, then requests 21 - 40, then pauses, and so on.

Finally, after a bit of research, I learned I could disable connection pooling entirely by setting agent: false in the request options:

http.request({ host: 'www.google.com', method: 'GET', agent: false }, function(res) {
    ...snip....

...and it'll run through all 2000 requests just fine.

My question is: is it a good idea to do this? Is there a danger that I could end up with too many HTTP connections? And why does it pause for 3 minutes? Surely once I've finished with a connection it should go straight back into the pool, ready for the next request to use, so why is it waiting 3 minutes? Forgive my ignorance.

Failing that, what is the best strategy for a Node.js app making a potentially large number of HTTP requests without locking up or crashing?

I'm running Node.js version 0.10 on Mac OSX 10.8.2.

I've found that if I convert the above code into a for loop and try to establish a bunch of connections at the same time, I start getting errors after about 242 connections. The error is:

Error was thrown: connect EMFILE
(libuv) Failed to create kqueue (24)

...and the code...

for (var i = 1; i <= 2000; i++) {
    (function(requestNo) {
        var request = http.request({ host: 'www.google.com', method: 'GET', agent: false }, function(res) {
            console.log('Completed ' + requestNo);
        });

        request.on('error', function(e) {
            console.log(e.name + ' was thrown: ' + e.message);
        });

        request.end();
    })(i);
}

I don't know if a heavily loaded Node.js app could ever reach that many simultaneous connections.
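(Purely as an illustration, and not part of the original question: one common way to avoid exhausting file descriptors is to cap how many requests are in flight at any one time. The maxConcurrent value and the launchMore helper below are invented for this sketch; each response is drained so its 'end' event fires.)

var http = require('http');

var total = 2000;
var maxConcurrent = 100; // assumed cap, comfortably below the ~242 descriptor limit hit above
var next = 1;
var inFlight = 0;

function launchMore() {
    while (inFlight < maxConcurrent && next <= total) {
        (function(requestNo) {
            inFlight++;
            var request = http.request({ host: 'www.google.com', method: 'GET', agent: false }, function(res) {
                // Drain the response so the socket is actually released.
                res.resume();
                res.on('end', function() {
                    console.log('Completed ' + requestNo);
                    inFlight--;
                    launchMore();
                });
            });

            request.on('error', function(e) {
                console.log(e.name + ' was thrown: ' + e.message);
                inFlight--;
                launchMore();
            });

            request.end();
        })(next++);
    }
}

launchMore();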

Recommended Answer

You have to consume the response.

Remember, in v0.10, we landed streams2. That means that data events don't happen until you start looking for them. So, you can do stuff like this:

http.createServer(function(req, res) {
  // this does some I/O, async
  // in 0.8, you'd lose data chunks, or even the 'end' event!
  lookUpSessionInDb(req, function(er, session) {
    if (er) {
      res.statusCode = 500;
      res.end("oopsie");
    } else {
      // no data lost
      req.on('data', handleUpload);
      // end event didn't fire while we were looking it up
      req.on('end', function() {
        res.end('ok, got your stuff');
      });
    }
  });
});

However, the flip side of streams that don't lose data when you're not reading it, is that they actually don't lose data if you're not reading it! That is, they start out paused, and you have to read them to get anything out.
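As a minimal sketch of that behaviour (reusing the host from the question), a response stays paused until you either attach a 'data' listener or call resume():

var http = require('http');

http.request({ host: 'www.google.com', method: 'GET' }, function(res) {
    // Nothing flows until we ask for it: either consume the body...
    res.on('data', function(chunk) { /* use chunk */ });
    res.on('end', function() { console.log('Body fully read, socket released'); });

    // ...or, if the body doesn't matter, simply drain it:
    // res.resume();
}).end();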

So, what's happening in your test is that you're making a bunch of requests and not consuming the responses, and then eventually the socket gets killed by google because nothing is happening, and it assumes you've died.

There are some cases where it's impossible to consume the incoming message: that is, if you don't add a 'response' event handler on a request, or if you completely write and finish the response message on a server without ever reading the request. In those cases, we just dump the data in the garbage for you.

However, if you are listening to the 'response' event, it's your responsibility to handle the object. Add a response.resume() in your first example, and you'll see it processes on through at a reasonable pace.
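For reference, here is the question's first example with that one-line change applied; nothing else is altered:

var http = require('http');
http.createServer(function() { }).listen(8124); // Prevent process shutting down

var requestNo = 1;
var maxRequests = 2000;

function requestTest() {
    http.request({ host: 'www.google.com', method: 'GET' }, function(res) {
        res.resume(); // consume the response so the socket goes back to the pool
        console.log('Completed ' + (requestNo++));

        if (requestNo <= maxRequests) {
            requestTest();
        }
    }).end();
}

requestTest();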
