Websocket transport reliability (Socket.io data loss during reconnection)

Problem description

NodeJS, Socket.io

Imagine there are 2 users U1 & U2, connected to an app via Socket.io. The algorithm is the following:

  1. U1 completely loses Internet connection (ex. switches Internet off)
  2. U2 sends a message to U1.
  3. U1 does not receive the message yet, because the Internet is down
  4. Server detects U1 disconnection by heartbeat timeout
  5. U1 reconnects to socket.io
  6. U1 never receives the message from U2 - it is lost on Step 4 I guess.

Possible explanation

I think I understand why it happens:

  • on Step 4 Server kills socket instance and the queue of messages to U1 as well
  • Moreover on Step 5 U1 and Server create new connection (it is not reused), so even if message is still queued, the previous connection is lost anyway.

How can I prevent this kind of data loss? I have to use heartbeats, because I do not want people to hang in the app forever. I must also still allow reconnection, because when I deploy a new version of the app I want zero downtime.

P.S. What I call a "message" is not just a text message I can store in a database, but a valuable system message whose delivery must be guaranteed, or the UI screws up.

Thanks!

I do already have a user account system. Moreover, my application is already complex. Adding offline/online statuses won't help, because I already have this kind of stuff. The problem is different.

Check out step 2. At this step we technically cannot say whether U1 has gone offline; he just loses connection for, let's say, 2 seconds, probably because of bad internet. So U2 sends him a message, but U1 doesn't receive it because the internet is still down for him (step 3). Step 4 is needed to detect offline users; let's say the timeout is 60 seconds. Eventually, in another 10 seconds, U1's internet connection is back up and he reconnects to socket.io. But the message from U2 is lost in space, because on the server U1 was disconnected by the timeout.

That is the problem: I want 100% delivery.

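Solution 1

  1. Collect each emit (emit name and data) in a per-user {} object, identified by a random emitID. Send the emit.
  2. Confirm the emit on the client side (send the emitID back to the server).
  3. If confirmed, delete the object identified by emitID from {}.
  4. If the user reconnects, check {} for this user and loop through it, executing step 1 for each object in {}.
  5. On disconnect and/or connect, flush {} for the user if necessary.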

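// Server
const pendingEmits = {}; // emitID -> { name, data }, kept until the client confirms

// 'reconnection' is a custom event sent by the client after it reconnects;
// resendAllPendingEmits is a helper (not shown) that re-sends every entry in pendingEmits
socket.on('reconnection', () => resendAllPendingEmits());
socket.on('confirm', (emitID) => { delete pendingEmits[emitID]; });

// Client
socket.on('something', ({ emitID, data }) => {
    // handle `data`, then confirm receipt by echoing the emitID back
    socket.emit('confirm', emitID);
});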
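The bookkeeping behind this approach can be sketched without Socket.io at all. This is a minimal illustration, not the asker's actual code: `PendingEmits`, `track`, `confirm`, and `resendAll` are hypothetical names, `send` stands in for whatever wraps `socket.emit` on the user's current socket, and a counter replaces the random emitID for simplicity.

```javascript
// Hypothetical helper: remembers emits per user until the client confirms them.
class PendingEmits {
  constructor() {
    this.byUser = new Map(); // userId -> Map(emitID -> { name, data })
    this.nextId = 1;         // assumption: a counter instead of a random emitID
  }

  // Remember the emit first, then send it.
  track(userId, name, data, send) {
    const emitID = String(this.nextId++);
    if (!this.byUser.has(userId)) this.byUser.set(userId, new Map());
    this.byUser.get(userId).set(emitID, { name, data });
    send(name, { emitID, data });
    return emitID;
  }

  // The client echoed emitID back, so the emit can be forgotten.
  confirm(userId, emitID) {
    const pending = this.byUser.get(userId);
    return pending ? pending.delete(emitID) : false;
  }

  // On reconnect, resend everything still unconfirmed, oldest first
  // (a Map iterates in insertion order).
  resendAll(userId, send) {
    const pending = this.byUser.get(userId) ?? new Map();
    for (const [emitID, { name, data }] of pending) {
      send(name, { emitID, data });
    }
    return pending.size;
  }
}

// Usage with a fake transport that only records what was sent.
const sent = [];
const send = (name, payload) => sent.push({ name, payload });

const emits = new PendingEmits();
const first = emits.track('U1', 'chat', 'hello', send);
emits.track('U1', 'chat', 'are you there?', send);
emits.confirm('U1', first);                 // the client acknowledged the first emit
const resent = emits.resendAll('U1', send); // U1 reconnects; only the second is resent
```

Keying the store by user rather than by socket matters here, because a reconnecting client gets a new socket instance.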

Solution 2 (kinda)

Added 1 Feb 2020.

While this is not really a solution for Websockets, someone may still find it handy. We migrated from Websockets to SSE + Ajax. SSE allows you to connect from a client to keep a persistent TCP connection and receive messages from a server in realtime. To send messages from a client to a server - simply use Ajax. There are disadvantages like latency and overhead, but SSE guarantees reliability because it is a TCP connection.

Since we use Express we use this library for SSE https://github.com/dpskvn/express-sse, but you can choose the one that fits you.

SSE is not supported in IE and most Edge versions, so you would need a polyfill: https://github.com/Yaffle/EventSource.
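One feature of SSE that helps with missed messages is that each event can carry an `id`, and a browser `EventSource` sends the last id it saw in the `Last-Event-ID` request header when it reconnects. The server still has to keep a history and replay from it. The sketch below only shows the wire format and the replay selection; `formatSseEvent` and `eventsAfter` are hypothetical helpers, not part of express-sse.

```javascript
// Format one event in the text/event-stream wire format.
// A blank line terminates the event.
function formatSseEvent(id, eventName, data) {
  return `id: ${id}\nevent: ${eventName}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Pick the events a reconnecting client missed, given the value of its
// Last-Event-ID header (undefined on a first connection).
function eventsAfter(history, lastEventId) {
  const last = Number(lastEventId) || 0; // missing header: replay everything
  return history.filter((e) => e.id > last);
}

// Assumed shape of the server-side history: ascending numeric ids.
const history = [
  { id: 1, name: 'notice', data: { text: 'a' } },
  { id: 2, name: 'notice', data: { text: 'b' } },
  { id: 3, name: 'notice', data: { text: 'c' } },
];

// A client reconnects saying the last event it saw was id 1.
const replay = eventsAfter(history, '1')
  .map((e) => formatSseEvent(e.id, e.name, e.data))
  .join('');
```

The string in `replay` is what would be written to the response stream before any new live events.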

Recommended answer

Others have hinted at this in other answers and comments, but the root problem is that Socket.IO is just a delivery mechanism, and you cannot depend on it alone for reliable delivery. The only person who knows for sure that a message has been successfully delivered to the client is the client itself. For this kind of system, I would recommend making the following assertions:

  1. Messages aren't sent directly to clients; instead, they get sent to the server and stored in some kind of data store.
  2. Clients are responsible for asking "what did I miss" when they reconnect, and will query the stored messages in the data store to update their state.
  3. If a message is sent to the server while the recipient client is connected, that message will be sent in real time to the client.
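The three assertions above can be sketched as follows. This is an illustration under stated assumptions: `MessageStore`, `missedSince`, and `handleIncoming` are hypothetical names, the store is an in-memory stand-in for a real data store, and `onlineSockets` is assumed to map a recipient to a socket-like object with an `emit` method.

```javascript
// Hypothetical in-memory stand-in for "some kind of data store".
class MessageStore {
  constructor() {
    this.nextId = 1;              // sequential ID used to order messages
    this.byRecipient = new Map(); // recipient -> array of { id, body }, ascending id
  }

  // Assertion 1: every message is stored before any delivery attempt.
  save(recipient, body) {
    const message = { id: this.nextId++, body };
    if (!this.byRecipient.has(recipient)) this.byRecipient.set(recipient, []);
    this.byRecipient.get(recipient).push(message);
    return message;
  }

  // Assertion 2: "what did I miss?" means everything newer than the last id seen.
  missedSince(recipient, lastSeenId) {
    return (this.byRecipient.get(recipient) ?? []).filter((m) => m.id > lastSeenId);
  }
}

// Assertion 3: deliver in real time only when the recipient is connected.
function handleIncoming(store, onlineSockets, recipient, body) {
  const message = store.save(recipient, body);
  const socket = onlineSockets.get(recipient);
  if (socket) socket.emit('message', message);
  return message;
}

// Usage: U1 is offline for two messages, then asks what it missed.
const store = new MessageStore();
const onlineSockets = new Map(); // empty: nobody is connected
handleIncoming(store, onlineSockets, 'U1', 'first');
handleIncoming(store, onlineSockets, 'U1', 'second');
const missed = store.missedSince('U1', 0);
```

Because storage happens before delivery, losing the connection at any point costs only the real-time push, never the message.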

Of course, depending on your application's needs, you can tune pieces of this--for example, you can use, say, a Redis list or sorted set for the messages, and clear them out if you know for a fact a client is up to date.

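Here are a couple of examples:

Happy path:

  • U1 and U2 are both connected to the system.
  • U2 sends a message to the server that U1 should receive.
  • The server stores the message in some kind of persistent store, marking it for U1 with some kind of timestamp or sequential ID.
  • The server sends the message to U1 via Socket.IO.
  • U1's client confirms (perhaps via a Socket.IO callback) that it received the message.
  • The server deletes the persisted message from the data store.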

Offline path:

  • U1 loses internet connectivity.
  • U2 sends a message to the server that U1 should receive.
  • The server stores the message in some kind of persistent store, marking it for U1 with some kind of timestamp or sequential ID.
  • The server sends the message to U1 via Socket.IO.
  • U1's client does not confirm receipt, because they are offline.
  • Perhaps U2 sends U1 a few more messages; they all get stored in the data store in the same fashion.
  • When U1 reconnects, it asks the server "The last message I saw was X / I have state X, what did I miss."
  • The server sends U1 all the messages it missed from the data store based on U1's request
  • U1's client confirms receipt and the server removes those messages from the data store.
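The offline path can be walked through with a small simulation. Everything here is illustrative: `makeClient` is a fake link that acknowledges only while online, the `stored` map stands in for the data store, and `sendToU1` and `onReconnect` are hypothetical names.

```javascript
// Fake client link: delivers and acknowledges only while "online".
function makeClient() {
  return {
    online: true,
    received: [],
    deliver(message, ack) {
      if (!this.online) return; // message lost in transit, no acknowledgement
      this.received.push(message.id);
      ack();
    },
  };
}

const stored = new Map(); // id -> message, standing in for the data store
let nextId = 1;

function sendToU1(client, body) {
  const message = { id: nextId++, body };
  stored.set(message.id, message);                          // store first
  client.deliver(message, () => stored.delete(message.id)); // delete on confirm
}

function onReconnect(client, lastSeenId) {
  // Copy the matching messages first, because acknowledgements mutate `stored`.
  const missed = [...stored.values()].filter((m) => m.id > lastSeenId);
  for (const message of missed) {
    client.deliver(message, () => stored.delete(message.id));
  }
}

const u1 = makeClient();
sendToU1(u1, 'delivered live'); // id 1: confirmed and deleted right away
u1.online = false;              // U1 loses connectivity
sendToU1(u1, 'missed #1');      // id 2 stays in the store
sendToU1(u1, 'missed #2');      // id 3 stays in the store
const waiting = stored.size;    // messages still awaiting confirmation
u1.online = true;               // U1 reconnects
onReconnect(u1, u1.received[u1.received.length - 1]); // "the last message I saw was 1"
```

After the reconnect, `u1.received` holds all three ids in order and `stored` is empty again.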

If you absolutely want guaranteed delivery, then it's important to design your system in such a way that being connected doesn't actually matter, and that realtime delivery is simply a bonus; this almost always involves a data store of some kind. As user568109 mentioned in a comment, there are messaging systems that abstract away the storage and delivery of said messages, and it may be worth looking into such a prebuilt solution. (You will likely still have to write the Socket.IO integration yourself.)

If you're not interested in storing the messages in the database, you may be able to get away with storing them in a local array; the server tries to send U1 the message, and stores it in a list of "pending messages" until U1's client confirms that it received it. If the client is offline, then when it comes back it can tell the server "Hey I was disconnected, please send me anything I missed" and the server can iterate through those messages.

Luckily, Socket.IO provides a mechanism that allows a client to "respond" to a message that looks like native JS callbacks. Here is some pseudocode:

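// server (inside io.on('connection', function(socket) { ... }))
// Note: a reconnecting client gets a new socket, so in a real app this list
// should live outside the connection handler, keyed by user ID.
const pendingMessagesForSocket = [];

function removePending(message) {
  const index = pendingMessagesForSocket.indexOf(message);
  if (index !== -1) pendingMessagesForSocket.splice(index, 1);
}

function sendMessage(message) {
  // assumption: each message carries a sequential `id`
  pendingMessagesForSocket.push(message);
  socket.emit('message', message, function() {
    // acknowledgement callback: the client confirmed receipt
    removePending(message);
  });
}

// 'reconnection' is a custom event emitted by the client (see below)
socket.on('reconnection', function(lastKnownMessageId) {
  // you may want to make sure you resend them in order, or one at a time, etc.
  // filter() returns a copy, so removePending can safely mutate the list
  const missed = pendingMessagesForSocket.filter(function(message) {
    return message.id > lastKnownMessageId;
  });
  for (const message of missed) {
    socket.emit('message', message, function() {
      removePending(message);
    });
  }
});

// client
let previouslyConnected = false;
let lastKnownMessageId = 0;

socket.on('connect', function() {
  if (previouslyConnected) {
    socket.emit('reconnection', lastKnownMessageId);
  } else {
    // first connection; any further connections means we disconnected
    previouslyConnected = true;
  }
});

socket.on('message', function(data, callback) {
  // Do something with `data`
  lastKnownMessageId = data.id;
  callback(); // confirm we received the message
});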

This is quite similar to the last suggestion, simply without a persistent data store.
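One consequence of resending unconfirmed messages is that a message can arrive twice: if the acknowledgement is lost rather than the message itself, the server resends something the client has already processed. A common remedy, sketched here as an assumption rather than as part of the answer, is to make the receiver ignore ids it has already handled. `createReceiver` is a hypothetical helper, and messages are assumed to carry a unique `id`.

```javascript
// Hypothetical receiver that handles each message id at most once.
function createReceiver(handle) {
  const seenIds = new Set();
  return function onMessage(message, ack) {
    if (!seenIds.has(message.id)) {
      seenIds.add(message.id);
      handle(message);
    }
    ack(); // always confirm, so the server can stop resending
  };
}

// Usage: the same message is delivered twice after a lost acknowledgement.
const handled = [];
let acks = 0;
const onMessage = createReceiver((m) => handled.push(m.body));
onMessage({ id: 7, body: 'update UI' }, () => acks++);
onMessage({ id: 7, body: 'update UI' }, () => acks++); // resend of the same message
```

The handler runs once, while both deliveries are acknowledged.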

You may also be interested in the concept of event sourcing.
