How do SO_REUSEADDR and SO_REUSEPORT differ?

Question

The man pages and programmer documentation for the socket options SO_REUSEADDR and SO_REUSEPORT differ between operating systems and are often highly confusing. Some operating systems don't even have the option SO_REUSEPORT. The web is full of contradictory information on this subject, and often you find information that is only true for one socket implementation of one specific operating system, which may not even be mentioned explicitly in the text.

So how do SO_REUSEADDR and SO_REUSEPORT differ?

Are systems without SO_REUSEPORT more limited?

And what exactly is the expected behavior if I use either one on different operating systems?

Recommended answer

Welcome to the wonderful world of portability... or rather the lack of it. Before we start analyzing these two options in detail and take a deeper look at how different operating systems handle them, it should be noted that the BSD socket implementation is the mother of all socket implementations. Basically all other systems copied the BSD socket implementation at some point in time (or at least its interfaces) and then started evolving it on their own. Of course the BSD socket implementation evolved as well at the same time, and thus systems that copied it later got features that were lacking in systems that copied it earlier. Understanding the BSD socket implementation is the key to understanding all other socket implementations, so you should read about it even if you never intend to write code for a BSD system.

There are a couple of basics you should know before we look at these two options. A TCP/UDP connection is identified by a tuple of five values:

{<protocol>, <src addr>, <src port>, <dest addr>, <dest port>}

Any unique combination of these values identifies a connection. As a result, no two connections can have the same five values, otherwise the system would not be able to distinguish these connections any longer.
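To make the five-value tuple concrete, here is a minimal Python sketch (not part of the original answer). It opens two TCP connections to the same loopback listener and prints the tuple of each. The helper names `connection_tuple` and `demo_two_connections` are made up for this illustration, and the sketch assumes IPv4 loopback is available.

```python
import socket

def connection_tuple(sock: socket.socket) -> tuple:
    """Return (protocol, src addr, src port, dest addr, dest port) for a connected IPv4 socket."""
    src_addr, src_port = sock.getsockname()
    dst_addr, dst_port = sock.getpeername()
    proto = "tcp" if sock.type == socket.SOCK_STREAM else "udp"
    return (proto, src_addr, src_port, dst_addr, dst_port)

def demo_two_connections() -> list:
    # Throwaway listener on loopback; port 0 lets the system pick a free port.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)
    clients = []
    try:
        for _ in range(2):
            client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            client.connect(listener.getsockname())
            clients.append(client)
        return [connection_tuple(c) for c in clients]
    finally:
        for c in clients:
            c.close()
        listener.close()

if __name__ == "__main__":
    for t in demo_two_connections():
        print(t)
```

Both connections share protocol, destination address and destination port; the system keeps them apart by giving each client a different source port.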

The protocol of a socket is set when the socket is created with the socket() function. The source address and port are set with the bind() function. The destination address and port are set with the connect() function. Since UDP is a connectionless protocol, UDP sockets can be used without connecting them. Yet it is allowed to connect them, and in some cases this is very advantageous for your code and general application design. In connectionless mode, UDP sockets that were not explicitly bound when data is sent over them for the first time are usually bound automatically by the system, as an unbound UDP socket cannot receive any (reply) data. The same is true for an unbound TCP socket: it is automatically bound before it is connected.
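The automatic binding can be observed directly. The following Python sketch (an illustration, not the author's code) asks an unbound UDP socket for its local port before and after the first sendto(). It assumes a POSIX-like system where getsockname() on an unbound IPv4 socket reports port 0; the function name `udp_autobind_ports` is hypothetical.

```python
import socket

def udp_autobind_ports() -> tuple:
    """Return the sender's local port before and after its first sendto()."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))          # explicit bind, system picks the port
        before = sender.getsockname()[1]         # sender was never bound by us
        sender.sendto(b"ping", receiver.getsockname())
        after = sender.getsockname()[1]          # the system bound it implicitly
        return before, after
    finally:
        sender.close()
        receiver.close()

if __name__ == "__main__":
    print(udp_autobind_ports())
```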

If you explicitly bind a socket, it is possible to bind it to port 0, which means "any port". Since a socket cannot really be bound to all existing ports, the system will have to choose a specific port itself in that case (usually from a predefined, OS specific range of source ports). A similar wildcard exists for the source address, which can be "any address" (0.0.0.0 in case of IPv4 and :: in case of IPv6). Unlike in case of ports, a socket can really be bound to "any address" which means "all source IP addresses of all local interfaces". If the socket is connected later on, the system has to choose a specific source IP address, since a socket cannot be connected and at the same time be bound to any local IP address. Depending on the destination address and the content of the routing table, the system will pick an appropriate source address and replace the "any" binding with a binding to the chosen source IP address.
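A small Python sketch of both wildcards, under the assumption that the loopback interface is up: the socket is bound to "any address" and "any port", and then connected, at which point the system replaces the wildcard address with a specific one. The function name `wildcard_then_connect` is invented for this example.

```python
import socket

def wildcard_then_connect() -> tuple:
    """Bind a UDP socket to 0.0.0.0:0, connect it, and return its local address before and after."""
    peer = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        peer.bind(("127.0.0.1", 0))
        sock.bind(("0.0.0.0", 0))                # "any address", "any port"
        bound = sock.getsockname()               # wildcard address, concrete port
        sock.connect(peer.getsockname())         # forces the choice of a source address
        connected = sock.getsockname()
        return bound, connected
    finally:
        sock.close()
        peer.close()

if __name__ == "__main__":
    print(wildcard_then_connect())
```

The port is fixed at bind time, while the address stays a wildcard until connect() requires a concrete source address for the chosen destination.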

By default, no two sockets can be bound to the same combination of source address and source port. As long as the source port is different, the source address is actually irrelevant. Binding socketA to A:X and socketB to B:Y, where A and B are addresses and X and Y are ports, is always possible as long as X != Y holds true. However, even if X == Y, the binding is still possible as long as A != B holds true. For example, if socketA belongs to an FTP server program and is bound to 192.168.0.1:21, and socketB belongs to another FTP server program and is bound to 10.0.0.1:21, both bindings will succeed. Keep in mind, though, that a socket may be locally bound to "any address". If a socket is bound to 0.0.0.0:21, it is bound to all existing local addresses at the same time, and in that case no other socket can be bound to port 21, regardless of which specific IP address it tries to bind to, as 0.0.0.0 conflicts with all existing local IP addresses.

Everything said so far is pretty much the same for all major operating systems. Things start to get OS specific when address reuse comes into play. We start with BSD, since, as I said above, it is the mother of all socket implementations.

If SO_REUSEADDR is enabled on a socket prior to binding it, the socket can be successfully bound unless there is a conflict with another socket bound to exactly the same combination of source address and port. Now you may wonder how that is any different from before. The keyword is "exactly". SO_REUSEADDR mainly changes the way wildcard addresses ("any IP address") are treated when searching for conflicts.

Without SO_REUSEADDR, binding socketA to 0.0.0.0:21 and then binding socketB to 192.168.0.1:21 will fail (with error EADDRINUSE), since 0.0.0.0 means "any local IP address", thus all local IP addresses are considered in use by this socket, and this includes 192.168.0.1, too. With SO_REUSEADDR it will succeed, since 0.0.0.0 and 192.168.0.1 are not exactly the same address: one is a wildcard for all local addresses and the other one is a very specific local address. Note that the statement above is true regardless of the order in which socketA and socketB are bound; without SO_REUSEADDR it will always fail, with SO_REUSEADDR it will always succeed.

To give you a better overview, let's make a table here and list all possible combinations:


SO_REUSEADDR       socketA        socketB       Result
---------------------------------------------------------------------
  ON/OFF       192.168.0.1:21   192.168.0.1:21    Error (EADDRINUSE)
  ON/OFF       192.168.0.1:21      10.0.0.1:21    OK
  ON/OFF          10.0.0.1:21   192.168.0.1:21    OK
   OFF             0.0.0.0:21   192.168.0.1:21    Error (EADDRINUSE)
   OFF         192.168.0.1:21       0.0.0.0:21    Error (EADDRINUSE)
   ON              0.0.0.0:21   192.168.0.1:21    OK
   ON          192.168.0.1:21       0.0.0.0:21    OK
  ON/OFF           0.0.0.0:21       0.0.0.0:21    Error (EADDRINUSE)

The table above assumes that socketA has already been successfully bound to the address given for socketA, then socketB is created, either gets SO_REUSEADDR set or not, and finally is bound to the address given for socketB. Result is the result of the bind operation for socketB. If the first column says ON/OFF, the value of SO_REUSEADDR is irrelevant to the result.
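If you want to check the rows of this table on your own machine, a rough Python sketch could look like the one below. It is an illustration under assumptions, not the author's test program: it uses 127.0.0.1 as the only specific address, a system-chosen port instead of port 21 (which needs privileges), and the helper name `bind_b_after_a` is made up. socketA never gets any option set, exactly as in the table. The OFF rows and the exact-duplicate rows should come out the same everywhere; the ON rows with a wildcard are where systems start to differ, as the OS-specific sections of this answer explain.

```python
import errno
import socket

def bind_b_after_a(addr_a: str, addr_b: str, reuse_b: bool) -> str:
    """Bind socketA (no options) to addr_a, then socketB to addr_b on the same port."""
    a = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    b = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        a.bind((addr_a, 0))                      # port 0: let the system pick a free port
        port = a.getsockname()[1]
        if reuse_b:
            b.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        b.bind((addr_b, port))
        return "OK"
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        a.close()
        b.close()

if __name__ == "__main__":
    pairs = [("127.0.0.1", "127.0.0.1"), ("0.0.0.0", "127.0.0.1"),
             ("127.0.0.1", "0.0.0.0"), ("0.0.0.0", "0.0.0.0")]
    for reuse in (False, True):
        for addr_a, addr_b in pairs:
            result = bind_b_after_a(addr_a, addr_b, reuse)
            print("%-4s %-10s %-10s %s" % ("ON" if reuse else "OFF", addr_a, addr_b, result))
```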

Okay, SO_REUSEADDR has an effect on wildcard addresses, good to know. Yet that isn't its only effect. There is another well-known effect, which is also the reason why most people use SO_REUSEADDR in server programs in the first place. For the other important use of this option we have to take a deeper look at how the TCP protocol works.

A socket has a send buffer, and if a call to the send() function succeeds, it does not mean that the requested data has actually been sent out; it only means the data has been added to the send buffer. For UDP sockets, the data is usually sent pretty soon, if not immediately, but for TCP sockets there can be a relatively long delay between adding data to the send buffer and having the TCP implementation really send that data. As a result, when you close a TCP socket, there may still be pending data in the send buffer, which has not been sent yet but which your code considers sent, since the send() call succeeded. If the TCP implementation closed the socket immediately on your request, all of this data would be lost and your code wouldn't even know about it. TCP is said to be a reliable protocol, and losing data just like that is not very reliable. That's why a socket that still has data to send will go into a state called TIME_WAIT when you close it. In that state it will wait until all pending data has been successfully sent or until a timeout is hit, in which case the socket is closed forcefully.

The amount of time the kernel will wait before it closes the socket, regardless of whether it still has data in flight or not, is called the Linger Time. The Linger Time is globally configurable on most systems and by default rather long (two minutes is a common value you will find on many systems). It is also configurable per socket using the socket option SO_LINGER, which can be used to make the timeout shorter or longer, and even to disable it completely. Disabling it completely is a very bad idea, though, since closing a TCP socket gracefully is a slightly complex process that involves sending a couple of packets back and forth (as well as resending those packets in case they got lost), and this whole close process is also limited by the Linger Time. If you disable lingering, your socket may not only lose data in flight, it is also always closed forcefully instead of gracefully, which is usually not recommended. The details of how a TCP connection is closed gracefully are beyond the scope of this answer; if you want to learn more about it, I recommend you have a look at this page.

And even if you disabled lingering with SO_LINGER, if your process dies without explicitly closing the socket, BSD (and possibly other systems) will linger nonetheless, ignoring what you have configured. This will happen, for example, if your code just calls exit() (pretty common for tiny, simple server programs) or the process is killed by a signal (which includes the possibility that it simply crashes because of an illegal memory access). So there is nothing you can do to make sure a socket will never linger under any circumstances.
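For reference, this is roughly how SO_LINGER is set and read back from Python. It is only a sketch: it assumes the C `struct linger` is laid out as two native ints (on/off flag, then seconds), which holds on Linux and the BSDs but not on Windows, and the helper names `set_linger` and `get_linger` are made up.

```python
import socket
import struct

LINGER_FORMAT = "ii"  # assumed layout of struct linger: l_onoff, l_linger (seconds)

def set_linger(sock: socket.socket, enabled: bool, seconds: int) -> None:
    """Configure the per-socket linger behaviour used when the socket is closed."""
    value = struct.pack(LINGER_FORMAT, int(enabled), seconds)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, value)

def get_linger(sock: socket.socket) -> tuple:
    """Return (enabled, seconds) as currently configured on the socket."""
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                          struct.calcsize(LINGER_FORMAT))
    onoff, seconds = struct.unpack(LINGER_FORMAT, raw)
    return bool(onoff), seconds

if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    set_linger(s, True, 5)
    print(get_linger(s))
    s.close()
```

Enabling the option with a timeout of 0 is the "disable lingering" case the text warns about.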

The question is, how does the system treat a socket in state TIME_WAIT? If SO_REUSEADDR is not set, a socket in state TIME_WAIT is considered to still be bound to the source address and port, and any attempt to bind a new socket to the same address and port will fail until the socket has really been closed, which may take as long as the configured Linger Time. So don't expect that you can rebind the source address of a socket immediately after closing it. In most cases this will fail. However, if SO_REUSEADDR is set for the socket you are trying to bind, another socket bound to the same address and port in state TIME_WAIT is simply ignored (after all, it's already "half dead"), and your socket can bind to exactly the same address without any problem. In that case it plays no role that the other socket may have exactly the same address and port. Note that binding a socket to exactly the same address and port as a dying socket in TIME_WAIT state can have unexpected, and usually undesired, side effects in case the other socket is still "at work", but that is beyond the scope of this answer, and fortunately those side effects are rather rare in practice.
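This is the reason for the usual server idiom of setting SO_REUSEADDR before bind(). The Python sketch below (hypothetical helper names, loopback only) simulates a quick server restart: the server side closes an accepted connection first, the listener is closed, and a new listener is bound to the same port right away. The first listener sets the option as well, which is what a restarted instance of the same program would have done anyway.

```python
import socket

def make_listener(port: int) -> socket.socket:
    """Create a TCP listener on loopback with SO_REUSEADDR set before bind()."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)  # must be set before bind()
    s.bind(("127.0.0.1", port))
    s.listen(5)
    return s

def restart_demo() -> bool:
    """Return True if a listener can be re-created on the same port immediately."""
    first = make_listener(0)
    port = first.getsockname()[1]
    client = socket.create_connection(("127.0.0.1", port))
    conn, _ = first.accept()
    conn.close()        # server side closes first, so its end may stay behind in TIME_WAIT
    client.close()
    first.close()
    second = make_listener(port)   # raises OSError if the rebind is refused
    ok = second.getsockname()[1] == port
    second.close()
    return ok

if __name__ == "__main__":
    print(restart_demo())
```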

There is one final thing you should know about SO_REUSEADDR. Everything written above will work as long as the socket you want to bind has address reuse enabled. It is not necessary that the other socket, the one which is already bound or is in a TIME_WAIT state, also had this flag set when it was bound. The code that decides whether the bind will succeed or fail only inspects the SO_REUSEADDR flag of the socket fed into the bind() call; for all other sockets inspected, this flag is not even looked at.

SO_REUSEPORT is what most people would expect SO_REUSEADDR to be. Basically, SO_REUSEPORT allows you to bind an arbitrary number of sockets to exactly the same source address and port as long as all previously bound sockets also had SO_REUSEPORT set before they were bound. If the first socket that is bound to an address and port does not have SO_REUSEPORT set, no other socket can be bound to exactly the same address and port, regardless of whether this other socket has SO_REUSEPORT set or not, until the first socket releases its binding again. Unlike in the case of SO_REUSEADDR, the code handling SO_REUSEPORT will not only verify that the socket currently being bound has SO_REUSEPORT set, it will also verify that the socket with a conflicting address and port had SO_REUSEPORT set when it was bound.

SO_REUSEPORT does not imply SO_REUSEADDR. This means that if a socket did not have SO_REUSEPORT set when it was bound and another socket has SO_REUSEPORT set when it is bound to exactly the same address and port, the bind fails, which is expected, but it also fails if the other socket is already dying and is in TIME_WAIT state. To be able to bind a socket to the same address and port as another socket in TIME_WAIT state, either SO_REUSEADDR must be set on that socket or SO_REUSEPORT must have been set on both sockets prior to binding them. Of course it is allowed to set both SO_REUSEPORT and SO_REUSEADDR on a socket.
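The "all sockets must have it set" rule can be sketched in Python as follows. This is an illustration, not the author's code: `bind_pair` is a made-up helper, it returns "UNSUPPORTED" when the Python build does not expose SO_REUSEPORT, and any other error is reported by its errno name.

```python
import errno
import socket

def bind_pair(first_reuseport: bool, second_reuseport: bool) -> str:
    """Bind two TCP sockets to the same address and port; report how the second bind went."""
    opt = getattr(socket, "SO_REUSEPORT", None)
    if opt is None:
        return "UNSUPPORTED"
    a = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    b = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if first_reuseport:
            a.setsockopt(socket.SOL_SOCKET, opt, 1)
        if second_reuseport:
            b.setsockopt(socket.SOL_SOCKET, opt, 1)
        a.bind(("127.0.0.1", 0))
        b.bind(a.getsockname())                  # exactly the same address and port
        return "OK"
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        a.close()
        b.close()

if __name__ == "__main__":
    for first, second in [(True, True), (False, True), (True, False), (False, False)]:
        print(first, second, bind_pair(first, second))
```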

There is not much more to say about SO_REUSEPORT other than that it was added later than SO_REUSEADDR. That's why you will not find it in many socket implementations of other systems, which "forked" the BSD code before this option was added, and why there was no way to bind two sockets to exactly the same socket address in BSD prior to this option.

Most people know that bind() may fail with the error EADDRINUSE, however, when you start playing around with address reuse, you may run into the strange situation that connect() fails with that error as well. How can this be? How can a remote address, after all that's what connect adds to a socket, be already in use? Connecting multiple sockets to exactly the same remote address has never been a problem before, so what's going wrong here?

As I said at the very top of my reply, a connection is defined by a tuple of five values, remember? And I also said that these five values must be unique, otherwise the system cannot distinguish two connections any longer, right? Well, with address reuse, you can bind two sockets of the same protocol to the same source address and port. That means three of those five values are already the same for these two sockets. If you now try to connect both of these sockets to the same destination address and port as well, you would create two connected sockets whose tuples are absolutely identical. This cannot work, at least not for TCP connections (UDP connections are no real connections anyway). If data arrived for either one of the two connections, the system could not tell which connection the data belongs to. At least the destination address or the destination port must differ between the two connections, so that the system has no problem identifying which connection incoming data belongs to.

So if you bind two sockets of the same protocol to the same source address and port and try to connect them both to the same destination address and port, connect() will actually fail with the error EADDRINUSE for the second socket you try to connect, which means that a socket with an identical tuple of five values is already connected.
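Here is a hedged Python sketch of that situation. Two client sockets are bound to the same local address and port (with SO_REUSEADDR and, where available, SO_REUSEPORT set on both) and then connected to the same listener. The helper names are hypothetical. Note that the error code is not the same everywhere; to my knowledge Linux reports EADDRNOTAVAIL for this condition rather than EADDRINUSE, so the sketch simply records whatever the system returns.

```python
import errno
import socket

def bound_client(addr) -> socket.socket:
    """Create a TCP socket with address reuse enabled and bind it to addr."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    if hasattr(socket, "SO_REUSEPORT"):
        try:
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        except OSError:
            pass                                 # option defined but not supported here
    s.bind(addr)
    return s

def connect_collision() -> list:
    """Connect two identically bound sockets to one destination; return both outcomes."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)
    clients = []
    try:
        clients.append(bound_client(("127.0.0.1", 0)))
        clients.append(bound_client(clients[0].getsockname()))
        outcomes = []
        for s in clients:
            try:
                s.connect(listener.getsockname())
                outcomes.append("OK")
            except OSError as exc:
                outcomes.append(errno.errorcode.get(exc.errno, str(exc.errno)))
        return outcomes
    finally:
        for s in clients:
            s.close()
        listener.close()

if __name__ == "__main__":
    print(connect_collision())
```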

Most people ignore the fact that multicast addresses exist, but they do exist. While unicast addresses are used for one-to-one communication, multicast addresses are used for one-to-many communication. Most people became aware of multicast addresses when they learned about IPv6, but multicast addresses also existed in IPv4, even though this feature was never widely used on the public Internet.

The meaning of SO_REUSEADDR changes for multicast addresses as it allows multiple sockets to be bound to exactly the same combination of source multicast address and port. In other words, for multicast addresses SO_REUSEADDR behaves exactly as SO_REUSEPORT for unicast addresses. Actually, the code treats SO_REUSEADDR and SO_REUSEPORT identically for multicast addresses, that means you could say that SO_REUSEADDR implies SO_REUSEPORT for all multicast addresses and the other way round.
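A minimal Python sketch of the multicast case, assuming the system allows binding a UDP socket directly to a multicast group address (not every environment does, so the first bind is allowed to fail). The group 224.1.2.3 is the one the test program described further down uses; the function names are made up.

```python
import socket

MCAST_ADDR = "224.1.2.3"   # arbitrary IPv4 multicast group used only for this test

def bind_udp_reuseaddr(addr: str, port: int):
    """Bind a UDP socket with SO_REUSEADDR set; return the socket or None on failure."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    try:
        s.bind((addr, port))
        return s
    except OSError:
        s.close()
        return None

def multicast_share() -> tuple:
    """Return (first_ok, second_ok) for two binds to the same multicast address and port."""
    first = bind_udp_reuseaddr(MCAST_ADDR, 0)
    if first is None:
        return (False, False)
    second = bind_udp_reuseaddr(MCAST_ADDR, first.getsockname()[1])
    second_ok = second is not None
    if second is not None:
        second.close()
    first.close()
    return (True, second_ok)

if __name__ == "__main__":
    print(multicast_share())
```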

FreeBSD, OpenBSD, and NetBSD are all rather late forks of the original BSD code; that's why all three offer the same options as BSD, and the options also behave the same way as in BSD.

At its core, macOS is simply a BSD-style UNIX named "Darwin", based on a rather late fork of the BSD code (BSD 4.3), which was then later on even re-synchronized with the (at that time current) FreeBSD 5 code base for the Mac OS 10.3 release, so that Apple could gain full POSIX compliance (macOS is POSIX certified). Despite having a microkernel at its core ("Mach"), the rest of the kernel ("XNU") is basically just a BSD kernel, and that's why macOS offers the same options as BSD and they also behave the same way as in BSD.

iOS is just a macOS fork with a slightly modified and trimmed kernel, a somewhat stripped-down user space toolset and a slightly different default framework set. watchOS and tvOS are iOS forks that are stripped down even further (especially watchOS). To the best of my knowledge, they all behave exactly as macOS does.

Prior to Linux 3.9, only the option SO_REUSEADDR existed. This option behaves generally the same as in BSD with two important exceptions:

  1. As long as a listening (server) TCP socket is bound to a specific port, the SO_REUSEADDR option is entirely ignored for all sockets targeting that port. Binding a second socket to the same port is only possible if it was also possible in BSD without having SO_REUSEADDR set. E.g. you cannot bind to a wildcard address and then to a more specific one or the other way round; both are possible in BSD if you set SO_REUSEADDR. What you can do is bind to the same port and two different non-wildcard addresses, as that's always allowed. In this aspect Linux is more restrictive than BSD.

  2. The second exception is that for client sockets, this option behaves exactly like SO_REUSEPORT in BSD, as long as both had this flag set before they were bound. The reason for allowing that was simply that it is important for various protocols to be able to bind multiple sockets to exactly the same UDP socket address, and as there used to be no SO_REUSEPORT prior to 3.9, the behavior of SO_REUSEADDR was altered accordingly to fill that gap. In that aspect Linux is less restrictive than BSD.
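The UDP side of this second exception can be checked with a few lines of Python. This is a sketch with a made-up function name; the result is only expected to be True on Linux, other systems may well refuse the second bind.

```python
import socket
import sys

def udp_share_with_reuseaddr() -> bool:
    """Try to bind two UDP sockets, both with SO_REUSEADDR, to the same unicast address and port."""
    a = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    b = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        a.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        b.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        a.bind(("127.0.0.1", 0))
        b.bind(a.getsockname())
        return True
    except OSError:
        return False
    finally:
        a.close()
        b.close()

if __name__ == "__main__":
    print(sys.platform, udp_share_with_reuseaddr())
```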

Linux >= 3.9

Linux 3.9 added the option SO_REUSEPORT to Linux as well. This option behaves exactly like the option in BSD and allows binding to exactly the same address and port number as long as all sockets have this option set prior to binding them.

Yet, there are still two differences from SO_REUSEPORT on other systems:

  1. To prevent "port hijacking", there is one special limitation: All sockets that want to share the same address and port combination must belong to processes that share the same effective user ID! So one user cannot "steal" ports of another user. This is some special magic to somewhat compensate for the missing SO_EXCLBIND/SO_EXCLUSIVEADDRUSE flags.

  2. Additionally, the kernel performs some "special magic" for SO_REUSEPORT sockets that isn't found in other operating systems: for UDP sockets, it tries to distribute datagrams evenly; for TCP listening sockets, it tries to distribute incoming connection requests (those accepted by calling accept()) evenly across all the sockets that share the same address and port combination. Thus an application can easily open the same port in multiple child processes and then use SO_REUSEPORT to get very inexpensive load balancing.
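To see the datagram distribution for yourself, something like the following Python sketch can be used. It is an illustration under assumptions: all sockets live in one process rather than in child processes, every datagram is sent from a fresh socket so that a kernel balancing per flow has distinct flows to spread, and the function name and parameters are invented. It returns None where SO_REUSEPORT is unavailable. How even the spread is depends on the system; only the total is predictable.

```python
import select
import socket

def reuseport_distribution(workers: int = 4, datagrams: int = 32, timeout: float = 2.0):
    """Return how many datagrams each of several SO_REUSEPORT UDP sockets received."""
    opt = getattr(socket, "SO_REUSEPORT", None)
    if opt is None:
        return None
    socks, port = [], 0
    try:
        for _ in range(workers):
            s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            socks.append(s)
            s.setsockopt(socket.SOL_SOCKET, opt, 1)
            s.bind(("127.0.0.1", port))          # first bind picks the port, the rest share it
            port = s.getsockname()[1]
            s.setblocking(False)
        for _ in range(datagrams):
            with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as tx:
                tx.sendto(b"x", ("127.0.0.1", port))
        counts = [0] * workers
        remaining = datagrams
        while remaining:
            ready, _, _ = select.select(socks, [], [], timeout)
            if not ready:
                break                            # nothing more arrived within the timeout
            for s in ready:
                try:
                    s.recvfrom(16)
                except BlockingIOError:
                    continue
                counts[socks.index(s)] += 1
                remaining -= 1
        return counts
    except OSError:
        return None                              # treat socket errors as "not supported here"
    finally:
        for s in socks:
            s.close()

if __name__ == "__main__":
    print(reuseport_distribution())
```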


Android

Even though the whole Android system is somewhat different from most Linux distributions, it runs a slightly modified Linux kernel at its core, so everything that applies to Linux should apply to Android as well.

Windows only knows the SO_REUSEADDR option, there is no SO_REUSEPORT. Setting SO_REUSEADDR on a socket in Windows behaves like setting SO_REUSEPORT and SO_REUSEADDR on a socket in BSD, with one exception: A socket with SO_REUSEADDR can always bind to exactly the same source address and port as an already bound socket, even if the other socket did not have this option set when it was bound. This behavior is somewhat dangerous because it allows an application "to steal" the connected port of another application. Needless to say, this can have major security implications. Microsoft realized that this might be a problem and thus added another socket option SO_EXCLUSIVEADDRUSE. Setting SO_EXCLUSIVEADDRUSE on a socket makes sure that if the binding succeeds, the combination of source address and port is owned exclusively by this socket and no other socket can bind to them, not even if it has SO_REUSEADDR set.
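In portable code this usually ends up as a per-platform choice of option. A possible Python sketch, with a hypothetical helper name: on Windows it requests exclusive ownership with SO_EXCLUSIVEADDRUSE (Python only exposes that constant on Windows), everywhere else it sets SO_REUSEADDR so that a restarted server can rebind quickly.

```python
import socket
import sys

def make_server_socket(addr: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Create a listening TCP socket with the bind option that suits the platform."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if sys.platform == "win32":
        # Keep other sockets from taking over this address and port.
        s.setsockopt(socket.SOL_SOCKET, socket.SO_EXCLUSIVEADDRUSE, 1)
    else:
        # BSD-style semantics: allow rebinding despite sockets in TIME_WAIT.
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind((addr, port))
    s.listen(5)
    return s

if __name__ == "__main__":
    server = make_server_socket()
    print(server.getsockname())
    server.close()
```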

For even more details on how the flags SO_REUSEADDR and SO_EXCLUSIVEADDRUSE work on Windows and how they influence binding and re-binding, Microsoft kindly provided a table similar to my table near the top of this reply. Just visit this page and scroll down a bit. Actually there are three tables: the first one shows the old behavior (prior to Windows 2003), the second one the current behavior (Windows 2003 and up), and the third one shows how the behavior changes in Windows 2003 and later if the bind() calls are made by different users.

Solaris is the successor of SunOS. SunOS was originally based on a fork of BSD, SunOS 5 and later was based on a fork of SVR4, however SVR4 is a merge of BSD, System V, and Xenix, so up to some degree Solaris is also a BSD fork, and a rather early one. As a result Solaris only knows SO_REUSEADDR, there is no SO_REUSEPORT. The SO_REUSEADDR behaves pretty much the same as it does in BSD. As far as I know there is no way to get the same behavior as SO_REUSEPORT in Solaris, that means it is not possible to bind two sockets to exactly the same address and port.

Similar to Windows, Solaris has an option to give a socket an exclusive binding. This option is named SO_EXCLBIND. If this option is set on a socket prior to binding it, setting SO_REUSEADDR on another socket has no effect when the two sockets are tested for an address conflict. E.g. if socketA is bound to a wildcard address and socketB has SO_REUSEADDR enabled and is bound to a non-wildcard address and the same port as socketA, this bind will normally succeed, unless socketA had SO_EXCLBIND enabled, in which case it will fail regardless of the SO_REUSEADDR flag of socketB.

In case your system is not listed above, I wrote a little test program that you can use to find out how your system handles these two options. Also if you think my results are wrong, please first run that program before posting any comments and possibly making false claims.

All that the code requires to build is a bit of POSIX API (for the network parts) and a C99 compiler (actually most non-C99 compilers will work as well, as long as they offer inttypes.h and stdbool.h; e.g. gcc supported both long before offering full C99 support).

All that the program needs to run is that at least one interface in your system (other than the local interface) has an IP address assigned and that a default route is set which uses that interface. The program will gather that IP address and use it as the second "specific address".

It tests all possible combinations you can think of:

  • TCP and UDP protocol
  • Normal sockets, listen (server) sockets, multicast sockets
  • SO_REUSEADDR set on socket1, socket2, or both sockets
  • SO_REUSEPORT set on socket1, socket2, or both sockets
  • All address combinations you can make out of 0.0.0.0 (wildcard), 127.0.0.1 (specific address), and the second specific address found at your primary interface (for multicast it's just 224.1.2.3 in all tests)

and prints the results in a nice table. It will also work on systems that don't know SO_REUSEPORT, in which case this option is simply not tested.
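Since the C program itself is not reproduced here, the following is a much reduced Python sketch of the same idea, not the author's code. It covers only TCP and UDP, only the addresses 0.0.0.0 and 127.0.0.1 (no second interface address, no listening or multicast sockets), and all names in it are made up. SO_REUSEPORT is tested only if the Python build exposes it.

```python
import errno
import itertools
import socket

OPTIONS = {"SO_REUSEADDR": socket.SO_REUSEADDR}
if hasattr(socket, "SO_REUSEPORT"):
    OPTIONS["SO_REUSEPORT"] = socket.SO_REUSEPORT
KINDS = {"TCP": socket.SOCK_STREAM, "UDP": socket.SOCK_DGRAM}
ADDRESSES = ["0.0.0.0", "127.0.0.1"]

def probe(kind: int, opt: int, on1: bool, on2: bool, addr1: str, addr2: str) -> str:
    """Bind socket1, then socket2 to the same port; return "OK" or the errno name."""
    s1 = socket.socket(socket.AF_INET, kind)
    s2 = socket.socket(socket.AF_INET, kind)
    try:
        if on1:
            s1.setsockopt(socket.SOL_SOCKET, opt, 1)
        if on2:
            s2.setsockopt(socket.SOL_SOCKET, opt, 1)
        s1.bind((addr1, 0))
        s2.bind((addr2, s1.getsockname()[1]))
        return "OK"
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        s1.close()
        s2.close()

def run_matrix() -> list:
    """Run every combination and return rows of (proto, option, on1, on2, addr1, addr2, result)."""
    rows = []
    for (pname, kind), (oname, opt), on1, on2, a1, a2 in itertools.product(
            KINDS.items(), OPTIONS.items(), (False, True), (False, True),
            ADDRESSES, ADDRESSES):
        rows.append((pname, oname, on1, on2, a1, a2, probe(kind, opt, on1, on2, a1, a2)))
    return rows

if __name__ == "__main__":
    for row in run_matrix():
        print("%-3s %-12s s1=%-5s s2=%-5s %-9s %-9s %s" % row)
```

Rows where neither socket has the option set should fail with EADDRINUSE on any system; the interesting differences between systems show up in the other rows.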

What the program cannot easily test is how SO_REUSEADDR acts on sockets in TIME_WAIT state, as it's very tricky to force and keep a socket in that state. Fortunately most operating systems seem to simply behave like BSD here, and most of the time programmers can simply ignore the existence of that state.

Here's the code (I cannot include it here, answers have a size limit and the code would push this reply over the limit).
