Implementing a well-performing "to-send" queue with TCP

Question

In order not to flood the remote endpoint my server app will have to implement a "to-send" queue of packets I wish to send.

I use Windows Winsock, I/O Completion Ports.

So, I know that when my code calls "socket->send(.....)" my custom "send()" function will check to see if data is already "on the wire" (towards that socket).
If data is indeed on the wire it will simply queue the new data to be sent later.
If no data is on the wire it will call WSASend() to really send the data.

So far everything is nice.

Now, the size of the data I'm going to send is unpredictable, so I break it into smaller chunks (say 64 bytes) in order not to waste memory for small packets, and queue/send these small chunks.

When a "write-done" completion status is given by IOCP regarding the packet I've sent, I send the next packet in the queue.

That's the problem: the speed is awfully low. I'm actually getting speeds like 200kb/s, and that's on a local connection (127.0.0.1).

So, I know I'll have to call WSASend() with several chunks (an array of WSABUF objects), and that will give much better performance, but how much should I send at once?
Is there a recommended size of bytes? I'm sure the answer is specific to my needs, yet I'm also sure there is some "general" point to start with.
Is there any other, better, way to do this?

Recommended answer

You only need to provide your own queue if you are trying to send data faster than the peer can process it (either because of link speed or because of the speed at which the peer can read and process the data), and even then only if you want to control the amount of system resources being used. If you only have a few connections then this is likely all unnecessary; if you have 1000s then it's something that you need to be concerned about. The main thing to realise here is that if you use ANY of the asynchronous network send APIs on Windows, managed or unmanaged, then you are handing control over the lifetime of your send buffers to the receiving application and the network. See here for more details.

And once you have decided that you DO need to bother with this, you still don't always need to bother: if the peer can process the data faster than you can produce it then there's no need to slow things down by queuing on the sender. You'll see that you need to queue data because your write completions will begin to take longer: the overlapped writes that you issue cannot complete while the TCP stack is unable to send any more data due to flow control (see http://www.tcpipguide.com/free/t_TCPWindowSizeAdjustmentandFlowControl.htm). At this point you are potentially using an unconstrained amount of limited system resources (both non-paged pool memory and the number of memory pages that can be locked are limited and (as far as I know) both are used by pending socket writes)...

Anyway, enough of that... I assume you already have achieved good throughput before you added your send queue? To achieve maximum performance you probably need to set the TCP window size to something larger than the default (see http://msdn.microsoft.com/en-us/library/ms819736.aspx) and post multiple overlapped writes on the connection.

Assuming you already HAVE good throughput then you need to allow a number of pending overlapped writes before you start queuing, this maximises the amount of data that is ready to be sent. Once you have your magic number of pending writes outstanding you can start to queue the data and then send it based on subsequent completions. Of course, as soon as you have ANY data queued all further data must be queued. Make the number configurable and profile to see what works best as a trade off between speed and resources used (i.e. number of concurrent connections that you can maintain).
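The rule described above (allow a configurable number of pending writes, queue everything beyond that, and drain the queue on completions) can be sketched roughly as follows. This is only an illustration under stated assumptions: `ThrottledSender` and its `issue_write` callback are hypothetical names, the callback stands in for whatever code actually calls WSASend(), and the class is single-threaded, whereas a real IOCP server would need a per-connection lock because completions arrive on several threads.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>

// Hypothetical helper, not part of Winsock: limits the number of outstanding
// overlapped writes on one connection and queues everything beyond that.
// Not thread-safe; a real IOCP server would guard it with a per-connection lock.
class ThrottledSender {
public:
    // issue_write is an assumption of this sketch: it stands in for the code
    // that would post an overlapped WSASend(). Real code must also keep the
    // buffer alive until the completion for that write arrives.
    ThrottledSender(std::size_t max_pending,
                    std::function<void(const std::string&)> issue_write)
        : max_pending_(max_pending), issue_write_(std::move(issue_write)) {}

    // Called by application code that wants to send a whole buffer.
    void send(std::string data) {
        // As soon as ANY data is queued, all further data must be queued too,
        // otherwise buffers would be written out of order.
        if (!queue_.empty() || pending_ >= max_pending_) {
            queue_.push_back(std::move(data));
            return;
        }
        ++pending_;
        issue_write_(data);
    }

    // Called when the completion port reports that one write has finished.
    void on_write_complete() {
        assert(pending_ > 0);
        --pending_;
        // Refill the freed slot(s) from the front of the queue.
        while (!queue_.empty() && pending_ < max_pending_) {
            std::string next = std::move(queue_.front());
            queue_.pop_front();
            ++pending_;
            issue_write_(next);
        }
    }

    std::size_t pending() const { return pending_; }
    std::size_t queued() const { return queue_.size(); }

private:
    std::size_t max_pending_;   // the configurable "magic number"
    std::size_t pending_ = 0;   // writes issued but not yet completed
    std::deque<std::string> queue_;
    std::function<void(const std::string&)> issue_write_;
};
```

With `max_pending` set to 1 this degenerates to the one-write-at-a-time scheme from the question; raising it keeps more data ready for the TCP stack at the cost of more memory tied up per connection, which is the trade-off to profile.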

I tend to queue the whole data buffer that is due to be sent as a single entry in a queue of data buffers. Since you're using IOCP it's likely that these data buffers are already reference counted, to make it easy to release them when the completions occur and not before. That makes the queuing process simpler: you just hold a reference to the send buffer whilst the data is in the queue and release it once you've issued a send.
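One way to picture the reference-counted approach is the sketch below. It assumes `std::shared_ptr` as the reference-counting mechanism, which the answer does not specify, and `BufferQueue` and `PendingWrite` are hypothetical names; in a real IOCP server the per-operation record would typically sit alongside the OVERLAPPED structure passed to WSASend().

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>
#include <utility>
#include <vector>

// A whole send buffer, shared by reference count (assumption: shared_ptr).
using Buffer = std::shared_ptr<const std::vector<char>>;

// Hypothetical per-operation record: holds a reference so the bytes stay
// valid until the write completion has been handled.
struct PendingWrite {
    Buffer buffer;
};

// Queue of whole buffers waiting to be sent; each entry is one reference.
class BufferQueue {
public:
    void enqueue(Buffer b) { queue_.push_back(std::move(b)); }

    // Takes the oldest buffer out of the queue and hands its reference to a
    // PendingWrite. The queue no longer refers to the buffer afterwards.
    // Returns nullptr when nothing is queued.
    std::unique_ptr<PendingWrite> begin_write() {
        if (queue_.empty()) return nullptr;
        auto op = std::make_unique<PendingWrite>();
        op->buffer = std::move(queue_.front());
        queue_.pop_front();
        return op;
    }

    std::size_t size() const { return queue_.size(); }

private:
    std::deque<Buffer> queue_;
};
```

Destroying the `PendingWrite` in the completion handler drops the last I/O-side reference, so the buffer is freed only after the completion has occurred and never before.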

Personally I wouldn't optimise by using scatter/gather writes with multiple WSABUFs until you have the base working and you know that doing so actually improves performance. I doubt that it will if you have enough data already pending; but as always, measure and you will know.

64 bytes is too small.

You may have already seen this but I wrote about the subject here: http://www.lenholgate.com/blog/2008/03/bug-in-timer-queue-code.html though it's possibly too vague for you.
