.NET WebSockets forcibly closed despite keep-alive and activity on the connection

Question

We have written a simple WebSocket client using System.Net.WebSockets. The KeepAliveInterval on the ClientWebSocket is set to 30 seconds.
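For reference, a minimal sketch of the kind of client described above. It is not the poster's actual code: the class name KeepAliveClient, the helper CreateClient, and whatever endpoint you pass in are all assumptions made for illustration.

```csharp
using System;
using System.Net.WebSockets;
using System.Threading;
using System.Threading.Tasks;

public static class KeepAliveClient
{
    // Hypothetical helper: builds a client with the 30-second keep-alive from the question.
    public static ClientWebSocket CreateClient()
    {
        var socket = new ClientWebSocket();
        socket.Options.KeepAliveInterval = TimeSpan.FromSeconds(30);
        return socket;
    }

    // Connects to the given endpoint (supply a real ws:// or wss:// server) and reports the state.
    public static async Task ConnectAsync(Uri endpoint, CancellationToken token)
    {
        using (var socket = CreateClient())
        {
            await socket.ConnectAsync(endpoint, token);
            Console.WriteLine("State after connect: " + socket.State);
        }
    }
}
```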

The connection opens successfully and traffic flows in both directions as expected. When the connection is idle, the client sends a Pong frame to the server every 30 seconds (visible in Wireshark).

But after 100 seconds the connection is abruptly terminated because the TCP socket is closed at the client end (in Wireshark we see the client send a FIN). The server responds with a 1001 Going Away before closing the socket.

After a lot of digging we have tracked down the cause and found a rather heavy-handed workaround. Despite a lot of Google and Stack Overflow searching, we have seen only a couple of other people posting about the problem and nobody with an answer, so I'm posting this to save others the pain and in the hope that someone can suggest a better workaround.

The source of the 100 second timeout is that the WebSocket uses a System.Net.ServicePoint, which has a MaxIdleTime property that allows idle sockets to be closed. When the WebSocket is opened, if a ServicePoint already exists for the Uri it is reused, with whatever MaxIdleTime it was given on creation. If not, a new ServicePoint instance is created, with MaxIdleTime set from the current value of the System.Net.ServicePointManager MaxServicePointIdleTime property (which defaults to 100,000 milliseconds).
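To see which idle timeout applies, the ServicePoint for a Uri can be looked up and inspected. The sketch below assumes .NET Framework, where ServicePointManager is not marked obsolete; the class name ServicePointInspector and the method GetIdleTimeout are hypothetical. It also assumes that looking up the same Uri the WebSocket connects to returns the ServicePoint the WebSocket actually uses, which is worth verifying in your own setup.

```csharp
using System;
using System.Net;

public static class ServicePointInspector
{
    // Returns the idle timeout, in milliseconds, of the ServicePoint for the given address.
    // FindServicePoint returns the existing instance for that address or creates a new one.
    public static int GetIdleTimeout(Uri address)
    {
        ServicePoint servicePoint = ServicePointManager.FindServicePoint(address);
        return servicePoint.MaxIdleTime;
    }

    // Prints the process-wide default next to the value on one specific ServicePoint.
    public static void Print(Uri address)
    {
        Console.WriteLine("Default for new ServicePoints (ms): "
            + ServicePointManager.MaxServicePointIdleTime);
        Console.WriteLine("ServicePoint for " + address + " (ms): " + GetIdleTimeout(address));
    }
}
```

If the two values differ, the ServicePoint was probably created before the default was changed, or something else has set its MaxIdleTime directly.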

The issue is that neither WebSocket traffic nor WebSocket keep-alives (Ping/Pong) appear to register as traffic as far as the ServicePoint idle timer is concerned. So exactly 100 seconds after the WebSocket is opened, it gets torn down despite traffic or keep-alives.

Our hunch is that this happens because the WebSocket starts life as an HTTP request that is then upgraded to a WebSocket, and the idle timer appears to look only for HTTP traffic. If that is indeed what is happening, it seems like a major bug in the System.Net.WebSockets implementation.

The workaround we are using is to set MaxIdleTime on the ServicePoint to int.MaxValue. This allows the WebSocket to stay open indefinitely. The downside is that this value applies to every other connection for that ServicePoint. In our context (a load test using Visual Studio Web and Load testing) we have other (HTTP) connections open for the same ServicePoint, and in fact there is already an active ServicePoint instance by the time we open our WebSocket. This means that after we update MaxIdleTime, all HTTP connections for the load test have no idle timeout. This doesn't feel quite comfortable, although in practice the web server should be closing idle connections anyway.
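A minimal sketch of that workaround, assuming .NET Framework. The class name IdleTimeoutWorkaround and the method DisableIdleTimeout are hypothetical, and it is an assumption that the Uri passed in resolves to the same ServicePoint the WebSocket uses.

```csharp
using System;
using System.Net;

public static class IdleTimeoutWorkaround
{
    // Raises the idle timeout on the ServicePoint for the given address to int.MaxValue
    // milliseconds (roughly 24.8 days), which effectively disables it.
    // This affects every connection that shares the ServicePoint, not only the WebSocket.
    // The ServicePoint is returned so the caller can inspect it or restore the old value later.
    public static ServicePoint DisableIdleTimeout(Uri address)
    {
        ServicePoint servicePoint = ServicePointManager.FindServicePoint(address);
        servicePoint.MaxIdleTime = int.MaxValue;
        return servicePoint;
    }
}
```

Calling this before or shortly after opening the WebSocket should both work in principle, since the question reports that updating an already active ServicePoint took effect.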

We also briefly explored whether we could create a new ServicePoint instance reserved just for our WebSocket connection, but couldn't see a clean way of doing that.

One other little twist made this harder to track down: although the System.Net.ServicePointManager MaxServicePointIdleTime property defaults to 100 seconds, Visual Studio overrides this value and sets it to 120 seconds, which made the timeout harder to search for.

Recommended answer

I ran into this issue this week. Your workaround pointed me in the right direction, but I believe I've narrowed down the root cause.

If a "Content-Length: 0" header is included in the "101 Switching Protocols" response from a WebSocket server, ClientWebSocket gets confused and schedules the connection for cleanup in 100 seconds.

This is the relevant code in the .NET reference source:

//if the returned contentlength is zero, preemptively invoke calldone on the stream.
//this will wake up any pending reads.
if (m_ContentLength == 0 && m_ConnectStream is ConnectStream) {
    ((ConnectStream)m_ConnectStream).CallDone();
}

According to RFC 7230 Section 3.3.2, Content-Length is prohibited in 1xx (Informational) messages, but I've found it mistakenly included by some server implementations.
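To check whether a particular server has this problem, the handshake response can be captured (for example in Wireshark) and its header block scanned. The sketch below is a plain string check and not part of any .NET API; the class name HandshakeCheck and the method HasContentLengthOn1xx are hypothetical.

```csharp
using System;

public static class HandshakeCheck
{
    // Returns true when a raw HTTP response head has a 1xx status code
    // and also carries a Content-Length header.
    public static bool HasContentLengthOn1xx(string responseHead)
    {
        string[] lines = responseHead.Split(new[] { "\r\n" }, StringSplitOptions.None);

        // The status line looks like "HTTP/1.1 101 Switching Protocols".
        string[] statusParts = lines[0].Split(' ');
        if (statusParts.Length < 2
            || statusParts[1].Length != 3
            || !statusParts[1].StartsWith("1", StringComparison.Ordinal))
        {
            return false;
        }

        for (int i = 1; i < lines.Length; i++)
        {
            if (lines[i].Length == 0)
            {
                break; // a blank line ends the header block
            }
            if (lines[i].StartsWith("Content-Length:", StringComparison.OrdinalIgnoreCase))
            {
                return true;
            }
        }
        return false;
    }
}
```

A response head for which this returns true matches the situation described in this answer, and the server or any proxy in front of it is the place to fix it.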

For additional details, including some sample code for diagnosing ServicePoint issues, see this thread: https://github.com/ably/ably-dotnet/issues/107
