Physical disconnection from network and intermittent socket error 10057


Question

A customer of mine has a Windows application in which two machines communicate over a network connection. The system is supposed to cope with the connection being lost. It does this by keeping a counter on the client, which is reset every time data is received from the server. If the counter reaches 60 seconds (i.e. we haven't heard from the server for 60 seconds), the client performs some expected action to cope with the lost connection.
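The idle counter described above can be sketched roughly as follows. This is only an illustration of the idea, not the customer's actual code: the class name `IdleWatchdog`, its methods, and the injectable `clock` parameter are all hypothetical.

```python
import time


class IdleWatchdog:
    """Tracks how long it has been since data last arrived from the server.

    Hypothetical helper: the names and the injectable clock are assumptions
    made for this sketch, not part of the original application.
    """

    def __init__(self, timeout_s=60.0, clock=time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock
        self._last_rx = clock()

    def data_received(self):
        # Call this every time recv() returns data; it resets the counter.
        self._last_rx = self._clock()

    def expired(self):
        # True once no data has been seen for at least timeout_s seconds.
        return self._clock() - self._last_rx >= self._timeout_s
```

The client's polling loop would call `data_received()` after each successful read and check `expired()` on every pass. A monotonic clock is used so that changes to the wall-clock time do not distort the 60-second window.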

The customer has a problem, however: sometimes the connection is lost but the client doesn't perform the expected action. Upon investigation, this appears to be an intermittent problem caused by the client's socket to the server sometimes raising error 10057 (WSAENOTCONN / "Socket is not connected") when the connection is lost. Because the client takes a different code path when it gets a socket error, the customer doesn't get the desired behaviour when this error occurs. This is not difficult for me to fix, but I am a bit puzzled by the different behaviour.
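One way to make the two cases behave the same is to treat "not connected" style socket errors as a lost connection, just like the 60-second timeout. The sketch below assumes a nonblocking socket polled by the client; `is_connection_lost`, `try_recv`, and the exact set of error codes are hypothetical choices for illustration, not the application's real logic.

```python
import errno

# Winsock code reported in the question (WSAENOTCONN).
WSAENOTCONN = 10057

# Assumption: these errno values are the ones this sketch treats as
# "the connection is gone"; the real application may want a different set.
_LOST_ERRNOS = {
    errno.ENOTCONN,
    errno.ECONNRESET,
    errno.ECONNABORTED,
    errno.ETIMEDOUT,
    errno.EPIPE,
}


def is_connection_lost(exc):
    """Return True if an OSError from a socket call means the link is dead."""
    if isinstance(exc, BlockingIOError):
        # Nonblocking socket with nothing to read: not an error condition.
        return False
    if getattr(exc, "winerror", None) == WSAENOTCONN:
        return True
    return exc.errno in _LOST_ERRNOS


def try_recv(sock, bufsize=4096):
    """Read from a nonblocking socket; return (data, lost)."""
    try:
        data = sock.recv(bufsize)
    except OSError as exc:
        if is_connection_lost(exc):
            return b"", True
        if isinstance(exc, BlockingIOError):
            return b"", False
        raise
    # An empty read on a stream socket means the peer closed the connection.
    return data, data == b""
```

With this shape, the caller runs the same recovery action whether `lost` is true or the idle timeout fires, so error 10057 no longer bypasses it.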

To reproduce the problem, I'm physically pulling the network cable out of the back of my server machine. Most of the time, the effect on the client side is that we simply stop receiving data over the socket, with no error. Some fraction of the time, however, error 10057 is raised. Can anyone shed any light on why there is this inconsistency? The client socket is a nonblocking STREAM socket.

Recommended answer

I would expect you to get an error only if you try to send something. That is when the TCP connection discovers that it can't reach the other endpoint. How long it takes to discover the failure varies, depending on the network round-trip time. There is also a "keep alive" socket option (SO_KEEPALIVE), which forces the socket to periodically send something so that a failure is detected even when the app is idle.
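As a rough illustration of the keep-alive suggestion, the sketch below turns on `SO_KEEPALIVE` and, where the platform exposes them, tightens the probe timings. The function name `enable_keepalive` and the timing values are assumptions chosen for the example; which tuning options exist depends on the operating system, so each one is guarded with `hasattr`.

```python
import socket


def enable_keepalive(sock, idle_s=10, interval_s=5, probes=3):
    """Enable TCP keep-alive probes on sock (hypothetical helper).

    idle_s, interval_s and probes are example values, not recommendations.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    if hasattr(socket, "SIO_KEEPALIVE_VALS"):
        # Windows: (on/off, idle time in ms, probe interval in ms).
        sock.ioctl(socket.SIO_KEEPALIVE_VALS,
                   (1, idle_s * 1000, interval_s * 1000))
        return

    # Other platforms: set whichever per-socket knobs are available.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
```

Without tuning, the default keep-alive idle time is typically on the order of hours, which is far longer than the 60-second window in the question, so shorter timings or an application-level heartbeat would be needed to notice a pulled cable quickly.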
