How do Node.js Streams work?

Question

I have a question about Node.js streams - specifically how they work conceptually.

There is no lack of documentation on how to use streams. But I've had difficulty finding how streams work at the data level.

My limited understanding of web communication, HTTP, is that full "packages" of data are sent back and forth. Similar to an individual ordering a company's catalogue, a client sends a GET (catalogue) request to the server, and the server responds with the catalogue. The browser doesn't receive a page of the catalogue, but the whole book.

Are node streams perhaps multipart messages?

I like the REST model - especially that it is stateless. Every single interaction between the browser and server is completely self-contained and sufficient. Are node streams therefore not RESTful? One developer mentioned the similarity with socket pipes, which keep the connection open. Back to my catalogue ordering example, would this be like an infomercial with the line "But wait! There's more!" instead of the fully contained catalogue?

A large part of streams is the ability for the receiver 'down-stream' to send messages like 'pause' & 'continue' upstream. What do these messages consist of? Are they POST?

Finally, my limited visual understanding of how Node works includes this event loop. Functions can be placed on separate threads from the thread pool, and the event loop carries on. But shouldn't sending a stream of data keep the event loop occupied (i.e. stopped) until the stream is complete? How is it ALSO keeping watch for the 'pause' request from downstream? Does the event loop place the stream on another thread from the pool and, when it encounters a 'pause' request, retrieve the relevant thread and pause it?

I've read the node.js docs, completed the nodeschool tutorials, built a Heroku app, purchased TWO books (real, self-contained books, kinda like the catalogues spoken of before and likely not like node streams), and asked several "node" instructors at code bootcamps - all speak about how to use streams but none speak about what's actually happening below.

Perhaps you have come across a good resource explaining how these work? Perhaps a good anthropomorphic analogy for a non-CS mind?

Recommended answer

The first thing to note is: node.js streams are not limited to HTTP requests. HTTP requests / Network resources are just one example of a stream in node.js.

Streams are useful for everything that can be processed in small chunks. They allow you to process potentially huge resources in smaller chunks that fit into your RAM more easily.

Say you have a file (several gigabytes in size) and want to convert all lowercase into uppercase characters and write the result to another file. The naive approach would read the whole file using fs.readFile (error handling omitted for brevity):

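var fs = require('fs');

// Reads the entire file into memory before converting it.
fs.readFile('my_huge_file', function (err, data) {
    var convertedData = data.toString().toUpperCase();

    // Current Node.js versions require a callback here.
    fs.writeFile('my_converted_file', convertedData, function (err) {});
});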

Unfortunately this approach will easily overwhelm your RAM, as the whole file has to be held in memory before you can process it. You would also waste precious time waiting for the file to be read. Wouldn't it make sense to process the file in smaller chunks? You could start processing as soon as you get the first bytes, while waiting for the hard disk to provide the remaining data:

var fs = require('fs');

var readStream = fs.createReadStream('my_huge_file');
var writeStream = fs.createWriteStream('my_converted_file');

// Called once for every chunk the file system delivers.
readStream.on('data', function (chunk) {
    var convertedChunk = chunk.toString().toUpperCase();
    writeStream.write(convertedChunk);
});

// No more data: close the output file.
readStream.on('end', function () {
    writeStream.end();
});
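
The hand-wired 'data' / 'end' version never checks whether the write stream is keeping up with the reader. As a minimal sketch (not the only way to do it), the same conversion can be written as a Transform stream and connected with stream.pipeline, which forwards errors and handles the pausing for you. The names createUpperCaseStream and convertFile are made up for this example, and the code assumes plain ASCII text, because a multi-byte character could be split across two chunks.

```javascript
const fs = require('fs');
const { Transform, pipeline } = require('stream');

// Hypothetical helper: builds a Transform stream that upper-cases each chunk.
// Assumption: the input is ASCII, so no character straddles a chunk boundary.
function createUpperCaseStream() {
    return new Transform({
        transform(chunk, encoding, callback) {
            // chunk is a Buffer by default; pass the converted text downstream.
            callback(null, chunk.toString().toUpperCase());
        }
    });
}

// Hypothetical helper: copies src to dest, upper-casing the data on the way.
function convertFile(src, dest, done) {
    pipeline(
        fs.createReadStream(src),
        createUpperCaseStream(),
        fs.createWriteStream(dest),
        done // called once, with an error if any stage failed
    );
}
```

Calling convertFile('my_huge_file', 'my_converted_file', callback) would then do the same job as the snippet above, with the callback receiving an error if reading, converting or writing failed.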

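This approach is much better:

  1. You only ever deal with small pieces of data that fit easily into your RAM.
  2. You start processing as soon as the first bytes arrive, instead of wasting time doing nothing but waiting.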

Once you open the stream, node.js will open the file and start reading from it. As soon as the operating system passes some bytes to the thread that is reading the file, they will be passed along to your application.

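Coming back to HTTP streams:

  1. The first problem applies here as well. An attacker could send you large amounts of data to overwhelm your RAM and take down (DoS) your service.
  2. However, the second problem is even more important in this case: the network may be slow (think smartphones), and it may take a long time until the client has sent everything. By using a stream you can start processing the request right away and cut response times.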
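
To make the two points concrete, here is a rough sketch of an HTTP handler that works on the request body while it is still arriving. It hashes the upload chunk by chunk, so memory use does not grow with the size of the body. The handler name hashUpload and the response format are invented for this example; a real service would also want a size limit and error handling.

```javascript
const http = require('http');
const crypto = require('crypto');

// Hypothetical handler: hashes an upload chunk by chunk,
// never holding the whole request body in memory.
function hashUpload(req, res) {
    const hash = crypto.createHash('sha256');
    let received = 0;

    // The request object is a readable stream; each chunk is a Buffer.
    req.on('data', function (chunk) {
        received += chunk.length;
        hash.update(chunk);
    });

    // The client has finished sending: answer with the result.
    req.on('end', function () {
        res.setHeader('Content-Type', 'text/plain');
        res.end(received + ' bytes, sha256 ' + hash.digest('hex') + '\n');
    });
}

// Not listening yet; call server.listen(port) to start accepting requests.
const server = http.createServer(hashUpload);
```

Starting it with server.listen(3000) (the port is an arbitrary choice) and sending a large POST request would keep the process's memory roughly flat, because each chunk can be discarded as soon as the hash has seen it.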

On pausing the HTTP stream: This is not done at the HTTP level, but way lower. If you pause the stream, node.js will simply stop reading from the underlying TCP socket. What happens then is up to the kernel. It may still buffer the incoming data, so it's ready for you once you have finished your current work. It may also inform the sender at the TCP level (by advertising a smaller receive window) that it should pause sending data. Applications don't need to deal with that; it is none of their business. In fact, the sender application probably does not even realize that you are no longer actively reading!
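
Inside your own process, "pause" and "continue" are not messages at all, just method calls and events on the stream objects. The sketch below shows the idea under simple assumptions: copyWithBackpressure is a made-up helper that stops reading whenever the destination's buffer is full and resumes once it has drained. This is roughly what pipe() does for you, so treat it as an illustration rather than something you need to write yourself.

```javascript
// Hypothetical helper: copies readable into writable, pausing the source
// whenever the destination cannot keep up.
function copyWithBackpressure(readable, writable, done) {
    readable.on('data', function (chunk) {
        // write() returns false once the writable's internal buffer is full.
        if (!writable.write(chunk)) {
            readable.pause(); // stop reading until the destination catches up
            writable.once('drain', function () {
                readable.resume(); // buffer emptied: start reading again
            });
        }
    });

    // The source is exhausted: finish the destination, then report back.
    readable.on('end', function () {
        writable.end(done);
    });
}
```

Nothing here blocks the event loop. Each 'data' and 'drain' handler runs briefly and returns, and the loop is free to do other work in between.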

So it's basically about being provided data as soon as it is available, but without overwhelming your resources. The underlying hard work is done either by the operating system (e.g. net, fs, http) or by the author of the stream you are using (e.g. zlib which is a Transform stream and usually bolted onto fs or net).
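
As an illustration of a Transform stream bolted onto fs, here is a small sketch that gzips one file into another. The helper name gzipFile is an invention for this example; the stream-related calls (fs.createReadStream, zlib.createGzip, fs.createWriteStream, stream.pipeline) are standard Node.js APIs.

```javascript
const fs = require('fs');
const zlib = require('zlib');
const { pipeline } = require('stream');

// Hypothetical helper: gzips src into dest without loading the whole file.
function gzipFile(src, dest, done) {
    pipeline(
        fs.createReadStream(src),   // chunks arrive as the disk delivers them
        zlib.createGzip(),          // Transform stream: compresses each chunk
        fs.createWriteStream(dest), // compressed chunks go back to disk
        done                        // called once, with an error if a stage failed
    );
}
```

Your code only describes how the pieces connect. The reading, compressing and writing happen in small chunks underneath, regardless of how large the file is.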
