Linux Loopback performance with TCP_NODELAY enabled

Question

I recently stumbled on an interesting TCP performance issue while running some performance tests that compared network performance versus loopback performance. In my case the network performance exceeded the loopback performance (1Gig network, same subnet). In the case I am dealing with, latencies are crucial, so TCP_NODELAY is enabled. The best theory that we have come up with is that TCP congestion control is holding up packets. We did some packet analysis and we can definitely see that packets are being held, but the reason is not obvious. Now the questions...

1) In what cases, and why, would communicating over loopback be slower than over the network?

2) When sending as fast as possible, why does toggling TCP_NODELAY have so much more of an impact on maximum throughput over loopback than over the network?

3) How can we detect and analyze TCP congestion control as a potential explanation for the poor performance?

4) Does anyone have any other theories as to the reason for this phenomenon? If yes, any method to prove the theory?

Here is some sample data generated by a simple point-to-point C++ app:


Transport     Message Size (bytes)  TCP NoDelay   Send Buffer (bytes)   Sender Host   Receiver Host   Throughput (bytes/sec)  Message Rate (msgs/sec)
TCP           128                   On            16777216              HostA         HostB           118085994                922546
TCP           128                   Off           16777216              HostA         HostB           118072006                922437
TCP           128                   On                4096              HostA         HostB            11097417                 86698
TCP           128                   Off               4096              HostA         HostB            62441935                487827
TCP           128                   On            16777216              HostA         HostA            20606417                160987
TCP           128                   Off           16777216              HostA         HostA           239580949               1871726
TCP           128                   On                4096              HostA         HostA            18053364                141041
TCP           128                   Off               4096              HostA         HostA           214148304               1673033
UnixStream    128                   -             16777216              HostA         HostA            89215454                696995
UnixDatagram  128                   -             16777216              HostA         HostA            41275468                322464
NamedPipe     128                   -             -                     HostA         HostA            73488749                574130

Here are a few more pieces of useful information:

  • I only see this issue with small messages
  • HostA and HostB both have the same hardware kit (Xeon X5550 @ 2.67 GHz, 32 cores total / 128 GB RAM / 1Gig NICs)
  • OS is RHEL 5.4 (kernel 2.6.18-164.2.1.el5)

Thanks

Answer

1) In what cases, and why, would communicating over loopback be slower than over the network?

Loopback puts the packet setup + TCP checksum calculation for both tx and rx on the same machine, so it needs to do twice as much processing, while with two machines you split the tx/rx work between them. This can have a negative impact on loopback.

2) When sending as fast as possible, why does toggling TCP_NODELAY have so much more of an impact on maximum throughput over loopback than over the network?

Not sure how you've come to this conclusion, but loopback and the network are implemented very differently, and if you try to push them to the limit you will hit different issues. Loopback interfaces (as mentioned in the answer to 1) cause tx+rx processing overhead on the same machine. On the other hand, NICs have a number of limits in terms of how many outstanding packets they can hold in their ring buffers etc., which will cause completely different bottlenecks (and this varies greatly from chip to chip, and even with the switch that sits between them).

3) How can we detect and analyze TCP congestion control as a potential explanation for the poor performance?

Congestion control only kicks in if there is packet loss. Are you seeing packet loss? Otherwise, you're probably hitting limits on the TCP window size versus network latency.

4) Does anyone have any other theories as to the reason for this phenomenon? If yes, any method to prove the theory?

I don't understand the phenomenon you refer to here. All I see in your table is that you have some sockets with a large send buffer - this can be perfectly legitimate. On a fast machine, your application will certainly be capable of generating more data than the network can pump out, so I'm not sure what you're classifying as a problem here.

One final note: small messages create a much bigger performance hit on your network for various reasons, such as:

  • The per-packet overhead (MAC + IP + TCP headers) is fixed, so the smaller the payload, the greater the proportional overhead.
  • Many NIC limits are relative to the number of outstanding packets, which means with smaller packets you will hit those NIC bottlenecks with much less data.
  • The network itself imposes per-packet overhead, so the maximum amount of data you can pump through the network again depends on the packet size.
