.NET WebSockets forcibly closed despite keep-alive and activity on the connection


Question

We have written a simple WebSocket client using System.Net.WebSockets. The KeepAliveInterval on the ClientWebSocket is set to 30 seconds.
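
For readers who want to reproduce the setup, here is a minimal sketch of such a client. The class name `KeepAliveClient` and its method names are made up for illustration, and the endpoint is whatever WebSocket server you are testing against; only `ClientWebSocket`, `Options.KeepAliveInterval`, and `ConnectAsync` come from System.Net.WebSockets.

```csharp
using System;
using System.Net.WebSockets;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not the code from our project.
static class KeepAliveClient
{
    // Builds a client with the 30 second keep-alive described above.
    public static ClientWebSocket Create()
    {
        var socket = new ClientWebSocket();
        socket.Options.KeepAliveInterval = TimeSpan.FromSeconds(30);
        return socket;
    }

    // Opens the connection; the caller supplies the ws:// or wss:// endpoint.
    public static async Task<ClientWebSocket> ConnectAsync(Uri endpoint, CancellationToken ct)
    {
        ClientWebSocket socket = Create();
        try
        {
            await socket.ConnectAsync(endpoint, ct);
            return socket;
        }
        catch
        {
            // Do not leak the socket if the handshake fails.
            socket.Dispose();
            throw;
        }
    }
}
```

The keep-alive interval has to be set before `ConnectAsync` is called, because the options are locked once the connection attempt starts.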

The connection opens successfully and traffic flows as expected in both directions; when the connection is idle, the client sends a Pong frame to the server every 30 seconds (visible in Wireshark).

But after 100 seconds the connection is abruptly terminated because the TCP socket is closed at the client end (in Wireshark we see the client send a FIN). The server responds with a 1001 Going Away close before closing its side of the socket.

After a lot of digging we tracked down the cause and found a rather heavy-handed workaround. Despite a lot of searching on Google and Stack Overflow, we found only a couple of other posts about this problem and none with an answer, so I'm posting this to save others the pain, and in the hope that someone can suggest a better workaround.

The source of the 100 second timeout is that the WebSocket uses a System.Net.ServicePoint, which has a MaxIdleTime property that allows idle sockets to be closed. When the WebSocket is opened, if a ServicePoint already exists for the Uri, that instance is used, with whatever MaxIdleTime it was given when it was created. If not, a new ServicePoint instance is created, with MaxIdleTime taken from the current value of the System.Net.ServicePointManager MaxServicePointIdleTime property (which defaults to 100,000 milliseconds).
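
To see which values are in play, something like the following sketch can be used. `ServicePointInspector` and `Describe` are made-up names; `ServicePointManager.FindServicePoint`, `MaxServicePointIdleTime`, `ServicePoint.MaxIdleTime`, and `CurrentConnections` are the real System.Net members. This assumes .NET Framework, where these settings govern the connection, and which Uri form maps to the WebSocket's ServicePoint (ws:// versus the equivalent http://) is something to verify in your own environment.

```csharp
#pragma warning disable SYSLIB0014 // ServicePointManager is marked obsolete on newer .NET versions
using System;
using System.Net;

// Hypothetical diagnostic helper for inspecting idle-timeout settings.
static class ServicePointInspector
{
    public static string Describe(Uri endpoint)
    {
        // Returns the existing ServicePoint for this host, or creates one.
        ServicePoint sp = ServicePointManager.FindServicePoint(endpoint);

        return $"global default={ServicePointManager.MaxServicePointIdleTime} ms, " +
               $"{sp.Address} MaxIdleTime={sp.MaxIdleTime} ms, " +
               $"connections={sp.CurrentConnections}";
    }
}
```

Calling `ServicePointInspector.Describe(uri)` before and after opening the WebSocket shows whether an existing ServicePoint, and therefore an existing MaxIdleTime, is being reused.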

The issue is that neither WebSocket traffic nor WebSocket keep-alives (Ping/Pong) appear to count as activity as far as the ServicePoint idle timer is concerned. So exactly 100 seconds after the WebSocket is opened it is torn down, regardless of traffic or keep-alives.

Our hunch is that this happens because the WebSocket starts life as an HTTP request that is then upgraded to a WebSocket. It appears that the idle timer only looks at HTTP traffic. If that is indeed what is happening, it seems like a major bug in the System.Net.WebSockets implementation.

The workaround we are using is to set MaxIdleTime on the ServicePoint to int.MaxValue. This allows the WebSocket to stay open indefinitely. The downside is that this value applies to every other connection on that ServicePoint. In our context (a load test using Visual Studio Web and Load Testing) we have other (HTTP) connections open on the same ServicePoint, and in fact there is already an active ServicePoint instance by the time we open our WebSocket. This means that after we update MaxIdleTime, all HTTP connections for the load test have no idle timeout. That doesn't feel quite comfortable, although in practice the web server should be closing idle connections anyway.
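
In code, the workaround amounts to roughly the following. This is a sketch rather than our exact code: `IdleTimeoutWorkaround` and `DisableIdleTimeout` are made-up names, and, as with any ServicePoint lookup, the Uri passed in has to resolve to the same ServicePoint the WebSocket uses, which is an assumption to check in your own setup.

```csharp
#pragma warning disable SYSLIB0014 // ServicePointManager is marked obsolete on newer .NET versions
using System;
using System.Net;

// Hypothetical helper illustrating the heavy-handed workaround.
static class IdleTimeoutWorkaround
{
    // Effectively removes the idle timeout for every connection to this host.
    public static ServicePoint DisableIdleTimeout(Uri endpoint)
    {
        ServicePoint sp = ServicePointManager.FindServicePoint(endpoint);

        // Applies to all connections on this ServicePoint, not just the WebSocket.
        sp.MaxIdleTime = int.MaxValue;
        return sp;
    }
}
```

Returning the ServicePoint lets the caller record the previous state or restore a finite MaxIdleTime once the WebSocket is closed, which limits how long the other HTTP connections run without an idle timeout.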

We also briefly explored whether we could create a new ServicePoint instance reserved just for our WebSocket connection, but couldn't see a clean way of doing that.

One other little twist made this harder to track down: although the System.Net.ServicePointManager MaxServicePointIdleTime property defaults to 100 seconds, Visual Studio overrides this value and sets it to 120 seconds, which made the problem harder to search for.

Answer

I ran into this issue this week. Your workaround pointed me in the right direction, but I believe I've narrowed down the root cause.

If a "Content-Length: 0" header is included in the "101 Switching Protocols" response from a WebSocket server, ClientWebSocket gets confused and schedules the connection for cleanup in 100 seconds.

Here is the offending code from the .NET Reference Source:

//if the returned contentlength is zero, preemptively invoke calldone on the stream.
//this will wake up any pending reads.
if (m_ContentLength == 0 && m_ConnectStream is ConnectStream) {
    ((ConnectStream)m_ConnectStream).CallDone();
}

According to RFC 7230 Section 3.3.2, a server must not send Content-Length in 1xx (Informational) responses, but I've found it mistakenly included by some server implementations.
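
If you have the raw handshake response to hand (for example copied out of a Wireshark capture), a small check like this sketch can flag the problem. `HandshakeCheck` and `HasContentLengthOn1xx` are made-up names, and the parsing is deliberately simple: it assumes CRLF line endings and a well-formed status line.

```csharp
using System;

// Hypothetical helper: detects a Content-Length header on a 1xx response.
static class HandshakeCheck
{
    public static bool HasContentLengthOn1xx(string responseHead)
    {
        string[] lines = responseHead.Split(new[] { "\r\n" }, StringSplitOptions.None);

        // Status line looks like: HTTP/1.1 101 Switching Protocols
        string[] status = lines[0].Split(' ');
        if (status.Length < 2 || status[1].Length != 3 || status[1][0] != '1')
        {
            return false; // not an informational response
        }

        for (int i = 1; i < lines.Length; i++)
        {
            if (lines[i].Length == 0) break; // blank line ends the headers

            int colon = lines[i].IndexOf(':');
            if (colon > 0 &&
                lines[i].Substring(0, colon).Trim()
                        .Equals("Content-Length", StringComparison.OrdinalIgnoreCase))
            {
                return true;
            }
        }
        return false;
    }
}
```

A server whose 101 response makes this return true is the kind that triggers the 100 second cleanup described above; fixing the server to omit the header avoids having to touch MaxIdleTime at all.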

For additional details, including some sample code for diagnosing ServicePoint issues, see this thread: https://github.com/ably/ably-dotnet/issues/107
