Limiting TCP sends with a "to-be-sent" queue and other design issues


Problem Description

This question is the result of two other questions I've asked in the last few days.
I'm creating a new question because I think it's related to the "next step" in my understanding of how to control the flow of my send/receive, something I didn't get a full answer to yet.
The other related questions are:
An IOCP documentation interpretation question - buffer ownership ambiguity
Non-blocking TCP buffer issues

In summary, I'm using Windows I/O Completion Ports.
I have several threads that process notifications from the completion port.
I believe the question is platform-independent and would have the same answer as doing the same thing on a *nix, *BSD, or Solaris system.

So, I need to have my own flow control system. Fine.
So I send send and send, a lot. How do I know when to start queueing the sends, as the receiver side is limited to X amount?

Let's take an example (closest thing to my question): FTP protocol.
I have two servers; one is on a 100Mb link and the other is on a 10Mb link.
I tell the 100Mb one to send a 1GB file to the other (the 10Mb-linked) one. It finishes with an average transfer rate of 1.25MB/s.
How did the sender (the 100Mb-linked one) know when to hold its sends, so the slower one wouldn't be flooded? (In this case the "to-be-sent" queue is the actual file on the hard disk.)

Another way to ask this:
Can I get a "hold-your-sendings" notification from the remote side? Is it built into TCP, or does the so-called "reliable network protocol" require me to do it myself?

I could of course limit my sends to a fixed number of bytes, but that simply doesn't sound right to me.

Again, I have a loop with many sends to a remote server, and at some point within that loop I'll have to determine whether I should queue that send or pass it on to the transport layer (TCP).
How do I do that? What would you do? Of course, when I get a completion notification from IOCP that a send was done I'll issue other pending sends; that's clear.
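The queue-or-send decision described above could be sketched roughly as follows. This is only an illustrative model (the class and limit are my own invention, and the actual WSASend calls are replaced by comments), showing one way to cap the number of outstanding asynchronous writes per connection and fall back to a "to-be-sent" queue beyond that cap:

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical per-connection flow control: allow at most maxPending
// outstanding async writes; any further data goes into a "to-be-sent" queue.
class SendFlow {
public:
    explicit SendFlow(std::size_t maxPending) : maxPending_(maxPending) {}

    // Called from the send loop. Returns true if the buffer was handed to
    // the transport (this is where WSASend would be issued), false if queued.
    bool submit(std::vector<char> buf) {
        if (pending_ < maxPending_) {
            ++pending_;            // would issue the async write here
            return true;
        }
        queue_.push_back(std::move(buf));
        return false;
    }

    // Called when the completion port reports a send completion: release a
    // slot and issue the next queued buffer, if any.
    void onSendComplete() {
        --pending_;
        if (!queue_.empty()) {
            queue_.pop_front();    // would issue the async write on this one
            ++pending_;
        }
    }

    std::size_t pending() const { return pending_; }
    std::size_t queued() const { return queue_.size(); }

private:
    std::size_t maxPending_;
    std::size_t pending_ = 0;
    std::deque<std::vector<char>> queue_;
};
```

The point of the sketch is only the bookkeeping: the send loop never has more than a fixed number of writes in flight, so resource use stays bounded regardless of how fast the loop produces data.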

Another design question related to this:
Since I am to use custom buffers with a send queue, and these buffers are freed for reuse (thus not using the "delete" keyword) when a "send-done" notification arrives, I'll have to use mutual exclusion on that buffer pool.
Using a mutex slows things down, so I've been thinking: why not have each thread own its own buffer pool? Then accessing it, at least when getting the buffers required for a send operation, would need no mutex, because the pool belongs to that thread only.
The buffer pool would live in thread local storage (TLS).
No mutual pool implies no lock needed, which implies faster operations, BUT it also implies more memory used by the app, because even if one thread has already allocated 1000 buffers, another thread that is sending right now and needs 1000 buffers will have to allocate 1000 of its own.
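For concreteness, the shared, mutex-protected variant being weighed here might look like this minimal sketch (class and method names are illustrative): buffers are handed out for a send and returned, rather than deleted, when the "send-done" notification arrives.

```cpp
#include <mutex>
#include <vector>

// Illustrative shared buffer pool guarded by a mutex. Buffers are recycled
// on release instead of being deleted, so steady-state sends allocate nothing.
class BufferPool {
public:
    explicit BufferPool(std::size_t bufSize) : bufSize_(bufSize) {}

    std::vector<char> acquire() {
        std::lock_guard<std::mutex> lock(m_);
        if (!free_.empty()) {
            std::vector<char> b = std::move(free_.back());
            free_.pop_back();
            return b;                        // reuse an existing buffer
        }
        return std::vector<char>(bufSize_);  // pool empty: allocate a new one
    }

    void release(std::vector<char> b) {
        std::lock_guard<std::mutex> lock(m_);
        free_.push_back(std::move(b));       // keep for reuse, no delete
    }

    std::size_t freeCount() const {
        std::lock_guard<std::mutex> lock(m_);
        return free_.size();
    }

private:
    std::size_t bufSize_;
    mutable std::mutex m_;
    std::vector<std::vector<char>> free_;
};
```

The TLS alternative would drop the lock by giving each thread its own instance, at the cost of duplicated capacity, exactly the trade-off described above.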

Another issue:
Say I have buffers A, B, C in the "to-be-sent" queue.
Then I get a completion notification that tells me the receiver got 10 out of 15 bytes. Should I re-send from the relative offset of the buffer, or will TCP handle it for me, i.e. complete the sending? And if I should, can I be assured that this buffer is the "next-to-be-sent" one in the queue, or could it be buffer B, for example?
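If the application did choose to handle such a short completion itself, the bookkeeping could look like this sketch (the struct and its fields are hypothetical, not tied to any real IOCP structure): track how many bytes the completion reported, and re-issue the remainder from that offset.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical record of one in-flight send. On a short completion the
// caller would re-issue the region starting at next(), length remaining().
struct PendingSend {
    std::vector<char> data;
    std::size_t offset = 0;   // bytes already reported as sent

    // Feed in the byte count from a completion notification.
    // Returns true once the whole buffer has been sent.
    bool onComplete(std::size_t bytesSent) {
        offset += bytesSent;
        return offset >= data.size();
    }

    const char* next() const { return data.data() + offset; }
    std::size_t remaining() const { return data.size() - offset; }
};
```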

This is a long question and I hope none got hurt (:

I'd loveeee to see someone take the time to answer here. I promise I'll double-vote for him! (:
Thank you all!

Recommended Answer

Firstly: I'd ask this as separate questions. You're more likely to get answers that way.

I've spoken about most of this on my blog: http://www.lenholgate.com but then since you've already emailed me to say that you read my blog you know that...

The TCP flow control issue arises because you are posting asynchronous writes, and each of these uses resources until it completes (see here). While a write is pending there are various resource usage issues to be aware of, and the use of your data buffer is the least important of them. You'll also use up some non-paged pool, which is a finite resource (though there is much more available in Vista and later than in previous operating systems), and you'll be locking pages in memory for the duration of the write; there's a limit to the total number of pages the OS can lock. Note that neither the non-paged pool usage nor the page locking issue is documented very well anywhere, but you'll start seeing writes fail with ENOBUFS once you hit them.

Due to these issues it's not wise to have an uncontrolled number of writes pending. If you are sending a large amount of data and you have no application-level flow control, then you need to be aware that if you send data faster than it can be processed by the other end of the connection, or faster than the link speed, then you will begin to use up lots and lots of the above resources as your writes take longer to complete due to TCP flow control and windowing issues. You don't get these problems with blocking socket code, as the write calls simply block when the TCP stack can't write any more due to flow control issues; with async writes the writes complete and are then pending. With blocking code the blocking deals with your flow control for you; with async writes you could continue to loop, generating more and more data which is all just waiting to be sent by the TCP stack...

Anyway, because of this, with async I/O on Windows you should ALWAYS have some form of explicit flow control. So, you either add application-level flow control to your protocol, using an ACK perhaps, so that you know when the data has reached the other side and only allow a certain amount to be outstanding at any one time, OR, if you can't add to the application-level protocol, you can drive things by using your write completions. The trick is to allow a certain number of outstanding write completions per connection and to queue the data (or just not generate it) once you have reached your limit. Then as each write completes you can generate a new write....
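The ACK-based option mentioned above can be sketched as a simple credit window (this is my own illustrative framing, not code from the answer's blog): the sender caps the number of unacknowledged messages and only frees a slot when an application-level ACK arrives from the receiver.

```cpp
#include <cstddef>

// Illustrative application-level flow control window: at most `window`
// messages may be outstanding (sent but not yet ACKed) at any one time.
class AckWindow {
public:
    explicit AckWindow(std::size_t window) : window_(window) {}

    // True if a message may be sent now; otherwise the caller queues it
    // (or simply doesn't generate it yet).
    bool trySend() {
        if (unacked_ >= window_) return false;
        ++unacked_;               // the message would go to the transport here
        return true;
    }

    // Called when an application-level ACK arrives from the receiver.
    void onAck() { if (unacked_ > 0) --unacked_; }

    std::size_t unacked() const { return unacked_; }

private:
    std::size_t window_;
    std::size_t unacked_ = 0;
};
```

The completion-driven variant the answer describes works the same way, except the slot is released by the IOCP write completion rather than by a protocol ACK, so it bounds what the local TCP stack holds rather than what the remote application has consumed.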

Your question about pooling the data buffers is, IMHO, premature optimisation on your part right now. Get to the point where your system is working properly and you have profiled it and found that contention on your buffer pool is the most important hot spot, and THEN address it. I found that per-thread buffer pools didn't work so well, as the distribution of allocations and frees across threads tends not to be as balanced as you'd need for that to work. I've spoken about this more on my blog: http://www.lenholgate.com/blog/2010/05/performance-comparisons-for-recent-code-changes.html

Your question about partial write completions (you send 100 bytes and the completion comes back and says that you have only sent 95) isn't really a problem in practice, IMHO. If you get to this position and have more than one outstanding write then there's nothing you can do; the subsequent writes may well work and you'll have bytes missing from what you expected to send. BUT a) I've never seen this happen unless you have already hit the resource problems that I detail above, and b) there's nothing you can do if you have already posted more writes on that connection, so simply abort the connection - note that this is why I always profile my networking systems on the hardware that they will run on, and I tend to place limits in MY code to prevent the OS resource limits ever being reached (bad drivers on pre-Vista operating systems often blue screen the box if they can't get non-paged pool, so you can bring a box down if you don't pay careful attention to these details).

And please, next time, ask these as separate questions.
