Delays in HTTP requests via Node.js compared to browser


Problem description

I'm using Node.js to query some public APIs via HTTP requests, and for that I'm using the request module. I'm measuring the response time within my application, and I see that it returns the results of API queries about 2-3 times slower than "direct" requests via curl or in the browser. I also noticed that connections to HTTPS-enabled services usually take longer than plain HTTP ones, but this could be a coincidence.

I tried to optimize my request options, but to no avail. For example, I query

https://www.linkedin.com/countserv/count/share?url=http%3A%2F%2Fwww.google.com%2F&lang=en_US

I'm using request.defaults to set the overall defaults for all requests:

var baseRequest = request.defaults({
    pool: {maxSockets: Infinity},
    jar: true,
    json: true,
    timeout: 5000,
    gzip: true,
    headers: {
        'Content-Type': 'application/json'
    }
});

The actual requests are made via

...
var start = new Date().getTime();

var options = {
    url: 'https://www.linkedin.com/countserv/count/share?url=http%3A%2F%2Fwww.google.com%2F&lang=en_US',
    method: 'GET'
};

baseRequest(options, function(error, response, body) {

    if (error) {
        console.log(error);
    } else {
        console.log((new Date().getTime()-start) + ": " + response.statusCode);
    }

});

Does anybody see optimization potential? Am I doing something completely wrong? Thanks in advance for any advice!

Solution

There are several potential issues you'll need to address given what I understand from your architecture. In no particular order they are:

  • Using request will always be slower than using http directly since, as the wise man once said, "abstraction costs". ;) In fact, to squeeze out every possible ounce of performance, I'd handle all HTTP requests using node's net module directly. For HTTPS, it's not worth rewriting the https module. And for the record, HTTPS will always be slower than HTTP by definition, due to both the need to handshake cryptographic keys and the encrypt/decrypt work on the payload.
  • If your requirements include retrieving more than one resource from any single server, ensure that those requests are made in order with http KeepAlive set, so you can benefit from the already open socket. The time it takes to handshake a new TCP socket is huge compared to making a request on an already open socket.
  • Ensure that http connection pooling is disabled (see Nodejs Max Socket Pooling Settings).
  • Ensure that your operating system and shell are not limiting the number of available sockets. See How many socket connections possible? for hints.
  • If you're using Linux, check Increasing the maximum number of tcp/ip connections in linux. I'd also strongly recommend fine-tuning the kernel socket buffers.
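A minimal sketch of the first two points, assuming a reasonably recent Node.js version: it uses the built-in https module instead of request and shares one keep-alive agent, so requests made one after another to the same host can reuse the open socket. The names keepAliveAgent, timedGet and getInOrder are made up for this illustration and are not part of any library.

```javascript
const https = require('https');

// One shared agent: keepAlive asks Node to hold idle sockets open for reuse.
const keepAliveAgent = new https.Agent({ keepAlive: true });

// Issue a GET through the agent and resolve with status, elapsed time,
// and whether the request went out on an already open socket.
function timedGet(url, agent = keepAliveAgent) {
  return new Promise((resolve, reject) => {
    const start = Date.now();
    const req = https.get(url, { agent }, (res) => {
      res.resume(); // drain the body so the socket can go back to the agent
      res.on('end', () => {
        resolve({
          statusCode: res.statusCode,
          ms: Date.now() - start,
          reusedSocket: req.reusedSocket,
        });
      });
    });
    req.on('error', reject);
  });
}

// Run the requests strictly one after another, so later requests
// can pick up the socket that an earlier one left open.
async function getInOrder(urls, get = timedGet) {
  const results = [];
  for (const url of urls) {
    results.push(await get(url));
  }
  return results;
}
```

Calling getInOrder with the same URL twice should report reusedSocket as true for the second request, provided the server agrees to keep the connection open. If you stay with request, its documented forever: true option is intended to give similar keep-alive behaviour.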

I'll add more suggestions as they occur to me.

Update

More on the topic of multiple requests to the same endpoint:

If you need to retrieve a number of resources from the same endpoint, it would be useful to segment your requests to specific workers that maintain open connections to that endpoint. In that way, you can be assured that you can get the requested resource as quickly as possible without the overhead of the initial TCP handshake.

The TCP handshake is a three-step process:

Step one: the client sends a SYN packet to the remote server.
Step two: the remote server replies to the client with a SYN+ACK.
Step three: the client replies to the remote server with an ACK.

Depending on the client's latency to the remote server, this can add up to (as William Proxmire once said) "real money", or in this case, delay.

From my desktop, the current latency (round-trip time measured by ping) for a 2K octet packet to www.google.com is anywhere between 37 and 227ms.

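So assuming that we can rely on a round-trip mean of 95ms (over a perfect connection), each leg of the handshake takes roughly 47.5ms, and the three legs SYN(47.5ms) + SYN+ACK(47.5ms) + ACK(47.5ms) add up to around 140ms. The client itself has to wait one full round trip (about 95ms) before it can send its request, so this is on the order of a tenth of a second just to establish the initial connection.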
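As a rough check on these figures, here is a small helper that derives handshake timings from a measured round-trip time. It assumes a symmetric path (one-way latency is half the round trip) and no retransmissions; the function name handshakeEstimate is hypothetical.

```javascript
// Estimate three-way handshake timing from a measured round-trip time.
// Assumes one-way latency = RTT / 2 and no lost packets.
function handshakeEstimate(rttMs) {
  const oneWayMs = rttMs / 2;
  return {
    oneWayMs,
    // SYN out + SYN+ACK back: the client may send data from here on.
    clientCanSendAfterMs: 2 * oneWayMs,
    // The final ACK reaches the server one more leg later.
    serverEstablishedAfterMs: 3 * oneWayMs,
  };
}

console.log(handshakeEstimate(95));
// → { oneWayMs: 47.5, clientCanSendAfterMs: 95, serverEstablishedAfterMs: 142.5 }
```

Either way of counting puts the cost of a fresh connection at roughly a tenth of a second for this latency, before any TLS negotiation or request data is exchanged.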

If the connection requires retransmission, it could take much longer.

And this is assuming you retrieve a single resource over a new TCP connection.
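To see how much of a particular request is spent on connection setup rather than on the server's actual response, one option is to timestamp the socket events Node emits while connecting. The following is a sketch under that assumption; phaseDurations and timePhases are illustrative names, and the millisecond resolution of Date.now() is coarse.

```javascript
const https = require('https');

// Convert absolute timestamps into per-phase durations. Phases that never
// fired (for example on a reused keep-alive socket) are reported as null.
function phaseDurations(marks) {
  const phases = ['lookup', 'connect', 'secureConnect', 'response', 'end'];
  const out = {};
  let prev = marks.start;
  for (const name of phases) {
    if (typeof marks[name] === 'number') {
      out[name] = marks[name] - prev;
      prev = marks[name];
    } else {
      out[name] = null;
    }
  }
  return out;
}

// Time DNS lookup, TCP connect, TLS handshake, first response and body end.
function timePhases(url, agent) {
  return new Promise((resolve, reject) => {
    const marks = { start: Date.now() };
    const req = https.get(url, { agent }, (res) => {
      marks.response = Date.now();
      res.resume(); // discard the body, we only care about timing
      res.on('end', () => {
        marks.end = Date.now();
        resolve(phaseDurations(marks));
      });
    });
    req.on('socket', (socket) => {
      if (!socket.connecting) return; // reused socket: no setup phases
      socket.once('lookup', () => { marks.lookup = Date.now(); });
      socket.once('connect', () => { marks.connect = Date.now(); });
      socket.once('secureConnect', () => { marks.secureConnect = Date.now(); });
    });
    req.on('error', reject);
  });
}
```

If lookup, connect and secureConnect account for most of the total, the delay is in connection setup, and reusing connections is likely to help more than tuning request options.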

To ameliorate this, I'd have your workers keep a pool of open connections to "known" destinations, which they would then advertise back to the supervisor process so it could direct requests to the least loaded worker with a "live" connection to the target server.
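Inside a single worker, the "pool of open connections to known destinations" could be approximated with one keep-alive agent per origin. This is only a sketch of that one piece, not of the supervisor protocol; agentFor, idleSocketCount and the maxSockets value of 4 are assumptions made for the example.

```javascript
const http = require('http');
const https = require('https');

const agents = new Map(); // origin (scheme + host + port) -> Agent

// Return the keep-alive agent for a URL's origin, creating it on first use.
function agentFor(url) {
  const { protocol, origin } = new URL(url);
  if (protocol !== 'http:' && protocol !== 'https:') {
    throw new TypeError('unsupported protocol: ' + protocol);
  }
  let agent = agents.get(origin);
  if (!agent) {
    const Agent = protocol === 'https:' ? https.Agent : http.Agent;
    agent = new Agent({ keepAlive: true, maxSockets: 4 });
    agents.set(origin, agent);
  }
  return agent;
}

// Number of idle, still-open sockets for an origin: a figure a worker
// could report to the supervisor as its "live" connections.
function idleSocketCount(url) {
  const agent = agents.get(new URL(url).origin);
  if (!agent) return 0;
  return Object.values(agent.freeSockets)
    .reduce((total, sockets) => total + sockets.length, 0);
}
```

A request would then pass agentFor(url) as its agent option, so all traffic to one destination shares the same small set of sockets.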
