Server-side SignalR connection fails after significant uptime

Problem description

I've searched numerous other questions related to SignalR connections on StackOverflow, but none of them seem to apply to my specific case.

I have an application that uses a SignalR hub. A client can connect to the hub using 2 methods:

  1. Through a .NET Core API that connects to the hub using an underlying client
  2. Directly, via the hub's URL

The issue I'm having is with connections made through the .NET Core API (method 1). When the server-side application has been running for a significant amount of time (maybe 2 weeks), the SignalR connection that the API uses fails. Direct connections to the SignalR hub (method 2) continue to work.

Here's how the connection works via the API:

.NET Core Web API

[Route("~/api/heartbeat")]
[HttpPost]
public async Task SendHeartbeat(string nodeId) {
    await SignalRClient.SendHeartbeat(nodeId);
    ...
}

SignalRClient

public static class SignalRClient
{

    private static HubConnection _hubConnection;

    /// <summary>
    /// Static SignalRHub client - to ensure that a single connection to the SignalRHub is re-used,
    /// and to prevent excessive connections that cause SignalR to fail
    /// </summary>
    static SignalRClient()
    {
        string signalRHubUrl = "...someUrl";

        _hubConnection = new HubConnectionBuilder()
        .WithUrl(signalRHubUrl)
        .Build();

        _hubConnection.Closed += async (error) =>
        {
            Log.Error("SignalR hub connection was closed - reconnecting. Error message - " + error.Message);

            await Task.Delay(new Random().Next(0, 5) * 1000);
            try
            {
                Log.Error("About to reconnect");
                await _hubConnection.StartAsync();
                Log.Error("Reconnect now requested");
            }
            catch (Exception ex)
            {
                Log.Error("Failed to restart connection to SignalR hub, following a disconnection: " + ex.Message);
            }
        };

        InitializeConnection();
    }

    private static async void InitializeConnection()
    {
        try
        {
            Log.Information("Checking hub connection status");
            if (_hubConnection.State == HubConnectionState.Disconnected)
            {
                Log.Information($"Starting SignalRClient using signalRHubUrl");
                await _hubConnection.StartAsync();
                Log.Information("SignalRClient started successfully");
            }
        }
        catch (Exception ex)
        {
            Log.Error("Failed to start connection to SignalRClient : " + ex.Message + ", " + ex.InnerException.Message);
        }
    }

    public static async Task SendHeartbeat(string nodeId)
    {
        try
        {
            Log.Information("Attempting to send heartbeat to SignalRHub");
            await _hubConnection.InvokeAsync("SendNodeHeartbeatToMonitors", nodeId);
        }
        catch (Exception ex)
        {
            Log.Error($"Error when sending heartbeat to SignalRClient  for NodeId: {nodeId}. Error: {ex.Message}");
        }
    }
}

After about 2 weeks of uptime, the connection fails and doesn't recover. I can see this error in the log:

Error when sending transaction to SignalRClient from /api/heartbeat: The 'InvokeCoreAsync' method cannot be called if the connection is not active

I don't understand how this is happening, as I'm using the _hubConnection.Closed event handler in the SignalRClient to handle the case when a connection is closed. The handler executes await _hubConnection.StartAsync(); to restart the connection, as shown in the code above.

For some reason the connection is being closed regularly (every 30 minutes), but it usually recovers, and I see the following error in the log:

SignalR hub connection was closed - reconnecting. Error message - The remote party closed the WebSocket connection without completing the close handshake.

This shows that the code is successfully entering the _hubConnection.Closed handler (as this is where I log that message), so it appears that the connection is usually restarted successfully.

So why does the connection sometimes fail completely and never get restarted? I'm wondering whether I'm connecting to the SignalR hub in a sensible way (in particular, whether using a static class for the SignalRClient is a good pattern). I'm also wondering whether my actual problem is all of those "The remote party closed the WebSocket connection without completing the close handshake." errors. If so, what could be causing them?

Any suggestions that point me in the right direction are greatly appreciated.

Recommended answer

I encountered this same problem a few years ago, which I solved at the time by placing all calls to StartAsync in their own task. And while I could be wrong about this, my own experiments indicated that the HubConnection itself isn't reusable, and thus also needs to be recreated after a disconnect.

So essentially I have a function called "CreateHubConnection" which does what you'd expect it to, and I have an async method to initiate server connections that looks like this:

private async Task ConnectToServer()
{
    // keep trying until we manage to connect
    while (true)
    {
        try
        {
            await CreateHubConnection();
            await this.Connection.StartAsync();
            return; // yay! connected
        }
        catch (Exception e) { /* bugger! */}
    }
}

My initial connection runs this in a new task:

this.Cancel = new CancellationTokenSource();
Task.Run(async () => await ConnectToServer(), this.Cancel.Token);

And the Connection.Closed handler also launches it in a new task:

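// In the ASP.NET Core client, Closed passes the exception that caused the
// close (null if the connection was closed deliberately).
this.Connection.Closed += async (error) =>
{
    try
    {
        await Task.Delay(1000); // don't want to hammer the network
        this.Cancel = new CancellationTokenSource();
        await Task.Run(async () => await ConnectToServer(), this.Cancel.Token);
    }
    catch (Exception _e) { /* give up */ }
};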

I don't know why this is necessary, but calling StartAsync directly from the Closed handler seems to create some kind of deadlock inside the SignalR library. I never did track down the exact cause; it could have been because my original call to StartAsync was made from the GUI thread. What fixed it was putting connections in their own tasks, creating a new HubConnection each time, and disposing old HubConnections that were no longer needed.
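The pieces described above (a fresh HubConnection per attempt, disposal of the old one, and reconnection from a separate task) could be combined roughly as follows. This is only a sketch, not the answerer's actual code: the class name ReconnectingHubClient, its Start and Stop methods, the OnClosed handler, and the one-second retry delay are all assumptions made for illustration. It needs the Microsoft.AspNetCore.SignalR.Client package.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.AspNetCore.SignalR.Client;

// Hypothetical wrapper; names and structure are illustrative only.
public sealed class ReconnectingHubClient
{
    private readonly string _hubUrl;
    private readonly CancellationTokenSource _cancel = new CancellationTokenSource();

    public HubConnection Connection { get; private set; }

    public ReconnectingHubClient(string hubUrl) => _hubUrl = hubUrl;

    // Run the first connection attempt on a thread-pool task.
    public Task Start() => Task.Run(() => ConnectToServer(_cancel.Token));

    // Stop any further connection attempts (for example on shutdown).
    public void Stop() => _cancel.Cancel();

    // Dispose the previous connection, if any, and build a fresh one.
    private async Task CreateHubConnection()
    {
        if (Connection != null)
        {
            // Unsubscribe first so disposing does not trigger another reconnect.
            Connection.Closed -= OnClosed;
            await Connection.DisposeAsync();
        }

        Connection = new HubConnectionBuilder().WithUrl(_hubUrl).Build();
        Connection.Closed += OnClosed;
    }

    // Keep trying until connected or until Stop() is called.
    private async Task ConnectToServer(CancellationToken token)
    {
        try
        {
            while (true)
            {
                token.ThrowIfCancellationRequested();
                try
                {
                    await CreateHubConnection();
                    await Connection.StartAsync(token);
                    return; // connected
                }
                catch (Exception) when (!token.IsCancellationRequested)
                {
                    // Attempt failed; fall through to the delay and retry.
                }

                // Assumed pause between attempts, to avoid hammering the network.
                await Task.Delay(TimeSpan.FromSeconds(1), token);
            }
        }
        catch (Exception) when (token.IsCancellationRequested)
        {
            // Stop() was called; give up quietly.
        }
    }

    // Closed handler: hand reconnection to a separate task and return at once,
    // so StartAsync is never awaited inside the handler itself.
    private Task OnClosed(Exception error)
    {
        _ = Task.Run(() => ConnectToServer(_cancel.Token));
        return Task.CompletedTask;
    }
}
```

A caller would create one instance, call Start() once, and use the Connection property for InvokeAsync calls, keeping in mind that the property is replaced with a new object after every reconnect.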

Would be very interested if someone with more knowledge of this has a better/easier solution.
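One possibly simpler option, for anyone on version 3.0 or later of the ASP.NET Core SignalR .NET client, is the built-in WithAutomaticReconnect() builder method. The sketch below shows only how it is wired up; the AutoReconnectExample class and the Build helper are hypothetical names, and whether this resolves the failure described in the question is untested.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.AspNetCore.SignalR.Client;

// Hypothetical helper; the class and method names are illustrative only.
public static class AutoReconnectExample
{
    public static HubConnection Build(string hubUrl)
    {
        var connection = new HubConnectionBuilder()
            .WithUrl(hubUrl)
            .WithAutomaticReconnect() // default retry schedule
            .Build();

        // Raised when the connection is lost and retries begin.
        connection.Reconnecting += error =>
        {
            Console.WriteLine("Connection lost, reconnecting: " + error?.Message);
            return Task.CompletedTask;
        };

        // Raised after a successful reconnect, with the new connection ID.
        connection.Reconnected += connectionId =>
        {
            Console.WriteLine("Reconnected with connection ID " + connectionId);
            return Task.CompletedTask;
        };

        // Raised once the client has stopped retrying, or was stopped on purpose.
        connection.Closed += error =>
        {
            Console.WriteLine("Connection closed: " + error?.Message);
            return Task.CompletedTask;
        };

        return connection;
    }
}
```

With the default settings the client gives up after a small number of attempts and then raises Closed, so a fallback in the Closed handler may still be needed. Automatic reconnect also does not retry a failed initial StartAsync call.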
