Node.JS Unbounded Concurrency / Stream backpressure over TCP

Question

As I understand it, one of the consequences of Node's evented IO model is the inability to tell a Node process that is (for example) receiving data over a TCP socket, to block, once you've hooked up your receiving event handlers (or otherwise started listening for data).

If the receiver can't process the incoming data fast enough, "unbounded concurrency" can result, whereby Node under the hood continues to read data off the socket as fast as it can, scheduling new data events on the event loop instead of blocking on the socket, until the process eventually runs out of memory and dies.

The receiver can't tell Node to slow its reading, which would otherwise allow TCP's inbuilt flow control mechanisms to kick in and indicate to the sender that it needs to slow down.

So firstly, is what I've described so far accurate? Is there something I've missed that would allow Node to avoid this situation?

One of the much touted features of Node Streams is the automatic handling of backpressure.

AFAIK, the only way a writable stream (of a tcp socket) can tell if it needs to slow down or not is by looking at socket.bufferSize (indicating the amount of data written to the socket but not yet sent). Given that Node at the receiving end always reads as fast as it can, this can only indicate a slow network connection between sender and receiver, and NOT whether the receiver can't keep up.

So secondly, can Node Streams' automatic backpressure somehow work in this scenario to deal with a receiver that can't keep up?

It also seems that this problem affects browsers receiving data via websockets, for a similar reason: the websockets API doesn't provide a mechanism to tell the browser to slow its reading from the socket.

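Is implementing a manual flow control mechanism at the application layer, one that explicitly tells the sending process to slow down, the only solution to this problem for Node (and for browsers using websockets)?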

Recommended answer

To answer your first question, I believe your understanding is not accurate -- at least not when piping data between streams. In fact, if you read the documentation for the pipe() function you'll see that it explicitly says that it automatically manages the flow so that "destination is not overwhelmed by a fast readable stream."

The underlying implementation of pipe() is taking care of all of the heavy lifting for you. The input stream (a Readable stream) will continue to emit data events until the output stream (a Writable stream) is full. As an aside, if I remember correctly, the Writable's write() call will return false when you attempt to write data that it cannot currently process. At this point, the pipe will pause() the Readable stream, which will prevent it from emitting further data events. Thus, the event loop isn't going to fill up and exhaust your memory, nor is it going to emit events that are simply lost. Instead, the Readable will stay paused until the Writable stream emits a drain event. At that point, the pipe will resume() the Readable stream.
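
To make the pause/drain/resume cycle described above concrete, here is a minimal sketch of roughly what pipe() does on your behalf. It is not Node's actual implementation; pumpWithBackpressure and makeSlowSink are hypothetical names introduced only for this illustration, and the tiny highWaterMark is chosen to force backpressure quickly.

```javascript
const { Readable, Writable } = require('stream');

// Hypothetical helper (not part of Node): forwards chunks from `source` to
// `dest`, pausing the source whenever dest.write() reports a full buffer.
function pumpWithBackpressure(source, dest) {
  return new Promise((resolve, reject) => {
    let pauses = 0;
    source.on('data', (chunk) => {
      const canTakeMore = dest.write(chunk);
      if (!canTakeMore) {
        // Destination buffer is at or above its highWaterMark: stop reading
        // until the destination signals that it has drained.
        pauses += 1;
        source.pause();
        dest.once('drain', () => source.resume());
      }
    });
    source.on('end', () => dest.end());
    dest.on('finish', () => resolve({ pauses }));
    source.on('error', reject);
    dest.on('error', reject);
  });
}

// Hypothetical slow consumer: a 4-byte buffer and 5 ms of "work" per chunk.
function makeSlowSink(received) {
  return new Writable({
    highWaterMark: 4,
    write(chunk, _encoding, done) {
      received.push(chunk.toString());
      setTimeout(done, 5);
    },
  });
}
```

Calling pumpWithBackpressure(Readable.from([Buffer.from('aaaa'), Buffer.from('bbbb')]), makeSlowSink([])) resolves once every chunk has been written, and the returned pauses count shows how often the source had to wait for the sink.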

The secret sauce is piping one stream into another, which manages the backpressure for you automatically. This hopefully answers your second question: Node can and does manage this automatically when you simply pipe streams.
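
For the TCP case from the question, a sketch along these lines may help. It assumes a receiver that pipes each incoming socket into a deliberately slow Writable; makeThrottledSink and startSlowReceiver are hypothetical names, and the 2 ms delay and 16 KiB buffer are arbitrary illustration values.

```javascript
const net = require('net');
const { Writable } = require('stream');

// Hypothetical slow consumer: about 2 ms of "work" per chunk and a 16 KiB
// internal buffer. Reports the total byte count once the stream finishes.
function makeThrottledSink(onFinish) {
  let bytes = 0;
  const sink = new Writable({
    highWaterMark: 16 * 1024,
    write(chunk, _encoding, done) {
      bytes += chunk.length;
      setTimeout(done, 2);
    },
  });
  sink.on('finish', () => onFinish(bytes));
  return sink;
}

// Starts a TCP server on an ephemeral loopback port. Each connection is
// piped into the slow sink, so pipe() pauses the socket while the sink is
// full and resumes it on 'drain'.
function startSlowReceiver(onFinish) {
  const server = net.createServer((socket) => {
    socket.pipe(makeThrottledSink(onFinish));
  });
  return new Promise((resolve) => {
    server.listen(0, '127.0.0.1', () => resolve(server));
  });
}
```

While the socket is paused, Node stops reading from it, so unread data accumulates in the kernel's receive buffer and TCP flow control can push back on the sender. On the sending side this would typically surface as socket.write() returning false.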

And finally, there is really no need to implement this manually (unless you are writing a new stream from scratch) since it is already provided for you. :)

Handling all of this is not easy, as admitted on the Node blog post that announced the streams2 API in Node. It's a great resource and certainly provides much more information than I could here. One little gotcha that isn't entirely obvious that you should know however, from the docs here and for backwards compatibility reasons:

If you attach a data event listener, then it will switch the stream into flowing mode, and data will be passed to your handler as soon as it is available.

So just be aware that attaching the data event listener in an attempt to observe something in the stream will fundamentally alter the stream to the old way of doing things. Ask me how I know.
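
The mode switch can be observed through the stream's readableFlowing property. The following is a small sketch, assuming a reasonably recent Node version; flowingStates is a hypothetical helper name used only here.

```javascript
const { Readable } = require('stream');

// Hypothetical helper: records readableFlowing before a 'data' listener is
// attached, after it is attached, and after an explicit pause().
function flowingStates() {
  // No-op read(): the stream never produces data, we only watch its mode.
  const source = new Readable({ read() {} });
  const states = [];

  states.push(source.readableFlowing); // null: nothing is consuming yet

  source.on('data', () => {}); // attaching the listener starts flowing mode
  states.push(source.readableFlowing); // true

  source.pause(); // an explicit pause() stops the flow again
  states.push(source.readableFlowing); // false

  return states;
}
```

A listener added purely for logging therefore starts the flow on its own, and any chunks it receives are consumed whether or not the rest of your code was ready for them.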
