Delays in HTTP requests via Node.js compared to browser

Question

I'm using Node.js to query some public APIs via HTTP requests, using the request module. I'm measuring the response time within my application and see that my application returns the results from API queries about 2-3 times slower than "direct" requests via curl or in the browser. I also noticed that connections to HTTPS-enabled services usually take longer than plain HTTP ones, but this could be a coincidence.

I tried to optimize my request options, but to no avail. For example, I query

https://www.linkedin.com/countserv/count/share?url=http%3A%2F%2Fwww.google.com%2F&lang=en_US

I'm using request.defaults to set the overall defaults for all requests:

var baseRequest = request.defaults({
    pool: {maxSockets: Infinity},
    jar: true,
    json: true,
    timeout: 5000,
    gzip: true,
    headers: {
        'Content-Type': 'application/json'
    }
});

The actual request is then executed via:

...
var start = new Date().getTime();

var options = {
    url: 'https://www.linkedin.com/countserv/count/share?url=http%3A%2F%2Fwww.google.com%2F&lang=en_US',
    method: 'GET'
};

baseRequest(options, function(error, response, body) {

    if (error) {
        console.log(error);
    } else {
        console.log((new Date().getTime()-start) + ": " + response.statusCode);
    }

});

Does anybody see optimization potential? Am I doing something completely wrong? Thanks in advance for any advice!

Answer

There are several potential issues you'll need to address given what I understand from your architecture. In no particular order they are:

  • Using request will always be slower than using http directly since, as the wise man once said, "abstraction costs". ;) In fact, to squeeze out every possible ounce of performance, I'd handle all HTTP requests using node's net module directly. For HTTPS, it's not worth rewriting the https module. And for the record, HTTPS will always be slower than HTTP by definition, due both to the need to handshake cryptographic keys and to the encryption/decryption work on the payload. (A minimal sketch using the built-in https module follows this list.)
  • If your requirements include retrieving more than one resource from any single server, ensure that those requests are made in order with HTTP keep-alive enabled, so you can benefit from the already open socket. The time it takes to handshake a new TCP socket is huge compared to making a request on an already open socket.
  • Ensure that http connection pooling is disabled (see Nodejs Max Socket Pooling Settings).
  • Ensure that your operating system and shell are not limiting the number of available sockets. See How many socket connections possible? for hints.
  • If you're using Linux, check Increasing the maximum number of tcp/ip connections in linux, and I'd also strongly recommend fine-tuning the kernel socket buffers.
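To make the first three points concrete, here is a minimal sketch (not the answer's exact code) that queries the endpoint from the question through Node's built-in https module with a keep-alive agent; the agent settings shown are assumptions you would tune for your own workload:

var https = require('https');

// Keep sockets open between requests instead of tearing them down;
// maxSockets here is an assumed value, not a recommendation.
var agent = new https.Agent({
    keepAlive: true,
    maxSockets: 10
});

var options = {
    hostname: 'www.linkedin.com',
    path: '/countserv/count/share?url=http%3A%2F%2Fwww.google.com%2F&lang=en_US',
    method: 'GET',
    agent: agent,
    headers: { 'Accept': 'application/json' }
};

var start = Date.now();

var req = https.request(options, function (res) {
    var body = '';
    res.on('data', function (chunk) { body += chunk; });
    res.on('end', function () {
        console.log((Date.now() - start) + 'ms: ' + res.statusCode);
    });
});

req.on('error', function (err) {
    console.error(err);
});

req.end();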

I'll add more suggestions as they occur to me.

More on the topic of multiple requests to the same endpoint:

If you need to retrieve a number of resources from the same endpoint, it would be useful to segment your requests to specific workers that maintain open connections to that endpoint. In that way, you can be assured that you can get the requested resource as quickly as possible without the overhead of the initial TCP handshake.
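As a rough way to see that benefit, the following sketch (again using the built-in https module, with the endpoint from the question assumed as the target) fires two sequential requests through one keep-alive agent; the first pays for the TCP/TLS handshake, while the second should reuse the socket and come back noticeably faster:

var https = require('https');

// One keep-alive agent shared by both requests.
var agent = new https.Agent({ keepAlive: true });

var sharePath = '/countserv/count/share?url=http%3A%2F%2Fwww.google.com%2F&lang=en_US';

function timedGet(label, callback) {
    var start = Date.now();
    var req = https.request({
        hostname: 'www.linkedin.com', // endpoint taken from the question
        path: sharePath,
        agent: agent
    }, function (res) {
        res.resume(); // drain the body so the socket is released for reuse
        res.on('end', function () {
            console.log(label + ': ' + res.statusCode + ' in ' + (Date.now() - start) + 'ms');
            callback();
        });
    });
    req.on('error', console.error);
    req.end();
}

// The first request pays for the handshake;
// the second should reuse the already open socket.
timedGet('first', function () {
    timedGet('second', function () {
        agent.destroy(); // close pooled sockets when done
    });
});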

The TCP handshake is a three-stage process.

Step one: client sends a SYN packet to the remote server. Step two: the remote server replies to the client with a SYN+ACK. Step three: the client replies to the remote server with an ACK.

Depending on the client's latency to the remote server, this can add up to (as William Proxmire once said) "real money", or in this case, delay.

From my desktop, the current latency (round-trip time measured by ping) for a 2K-octet packet to www.google.com is anywhere between 37 and 227 ms.

So assuming that we can rely on a round-trip mean of 95 ms (over a perfect connection), the time for the initial TCP handshake would be around 135 ms, or SYN (45 ms) + SYN+ACK (45 ms) + ACK (45 ms), and this is more than a tenth of a second just to establish the initial connection.

If the connection requires retransmission, it could take much longer.

And this is assuming you retrieve a single resource over a new TCP connection.
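If you want to observe that setup cost on your own machine, a quick (and admittedly crude) way is to time how long a bare TCP socket takes to reach its 'connect' event; the host and port below are just examples:

var net = require('net');

var start = Date.now();

// 'connect' fires once the TCP handshake with the remote host is complete.
var socket = net.connect(443, 'www.linkedin.com', function () {
    console.log('TCP connection established in ' + (Date.now() - start) + 'ms');
    socket.end();
});

socket.on('error', function (err) {
    console.error(err);
});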

To ameliorate this, I'd have your workers keep a pool of open connections to "known" destinations which they would then advertise back to the supervisor process so it could direct requests to the least loaded server with a "live" connection to the target server.
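A much-simplified, single-process stand-in for that worker/supervisor idea is a small registry that hands out one keep-alive agent per known host, so every request to that host rides an already-open connection; the names agentFor and fetchFrom below are made up for illustration:

var https = require('https');

// hostname -> https.Agent; a stand-in for a worker's pool of live connections.
var agents = {};

function agentFor(hostname) {
    if (!agents[hostname]) {
        agents[hostname] = new https.Agent({
            keepAlive: true,
            maxSockets: 5 // assumed per-host limit; tune for your workload
        });
    }
    return agents[hostname];
}

function fetchFrom(hostname, path, callback) {
    var req = https.request({
        hostname: hostname,
        path: path,
        agent: agentFor(hostname) // reuse a live connection when one exists
    }, function (res) {
        var body = '';
        res.on('data', function (chunk) { body += chunk; });
        res.on('end', function () { callback(null, res.statusCode, body); });
    });
    req.on('error', callback);
    req.end();
}

In a real deployment, each worker would own such a registry and report its connection state and load back to the supervisor, which could then route each request to the least loaded worker that already holds a live connection to the target server.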
