How do Node.js Streams work?


Question

I have a question about Node.js streams - specifically how they work conceptually.

There is no lack of documentation on how to use streams. But I've had difficulty finding how streams work at the data level.

My limited understanding of web communication, HTTP, is that full "packages" of data are sent back and forth. Similar to an individual ordering a company's catalogue, a client sends a GET (catalogue) request to the server, and the server responds with the catalogue. The browser doesn't receive a page of the catalogue, but the whole book.

Are node streams perhaps multipart messages?

I like the REST model - especially that it is stateless. Every single interaction between the browser and server is completely self contained and sufficient. Are node streams therefore not RESTful? One developer mentioned the similarity with socket pipes, which keep the connection open. Back to my catalogue ordering example, would this be like an infomercial with the line "But wait! There's more!" instead of the fully contained catalogue?

A large part of streams is the ability for the receiver 'down-stream' to send messages like 'pause' & 'continue' upstream. What do these messages consist of? Are they POST?

Finally, my limited visual understanding of how Node works includes this event loop. Functions can be placed on separate threads from the thread pool, and the event loop carries on. But shouldn't sending a stream of data keep the event loop occupied (i.e. stopped) until the stream is complete? How is it ALSO keeping watch for the 'pause' request from downstream? Does the event loop place the stream on another thread from the pool and, when it encounters a 'pause' request, retrieve the relevant thread and pause it?

I've read the node.js docs, completed the nodeschool tutorials, built a heroku app, purchased TWO books (real, self contained, books, kinda like the catalogues spoken before and likely not like node streams), asked several "node" instructors at code bootcamps - all speak about how to use streams but none speak about what's actually happening below.

Perhaps you have come across a good resource explaining how these work? Perhaps a good anthropomorphic analogy for a non-CS mind?

Answer

The first thing to note is: node.js streams are not limited to HTTP requests. HTTP requests / Network resources are just one example of a stream in node.js.
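For example, even a process's standard input and output are streams. This tiny sketch simply copies whatever arrives on stdin to stdout, chunk by chunk, without ever holding the whole input in memory:

// stdin and stdout are streams too; pipe() forwards data between them
// chunk by chunk instead of buffering everything first.
process.stdin.pipe(process.stdout);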

Streams are useful for everything that can be processed in small chunks. They allow you to process potentially huge resources in smaller chunks that fit into your RAM more easily.

Say you have a file (several gigabytes in size) and want to convert all lowercase into uppercase characters and write the result to another file. The naive approach would read the whole file using fs.readFile (error handling omitted for brevity):

var fs = require('fs');

fs.readFile('my_huge_file', function (err, data) {
    // the whole file is now held in memory as a single Buffer
    var convertedData = data.toString().toUpperCase();

    fs.writeFile('my_converted_file', convertedData, function (err) {
        // error handling omitted for brevity
    });
});

Unfortunately this approach will easily overwhelm your RAM, as the whole file has to be read into memory before it can be processed. You would also waste precious time waiting for the file to be read. Wouldn't it make sense to process the file in smaller chunks? You could start processing as soon as you get the first bytes while waiting for the hard disk to provide the remaining data:

var fs = require('fs');

var readStream = fs.createReadStream('my_huge_file');
var writeStream = fs.createWriteStream('my_converted_file');
readStream.on('data', function (chunk) {
    // chunk is a small Buffer, not the whole file
    var convertedChunk = chunk.toString().toUpperCase();
    writeStream.write(convertedChunk);
});
readStream.on('end', function () {
    // the source file has been read completely; close the output
    writeStream.end();
});

This approach is much better:

  1. You only deal with small chunks of data that easily fit into your RAM.

  2. You start processing as soon as the first bytes arrive instead of wasting time doing nothing but waiting.
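For completeness, the same conversion can also be wired up with pipe() and a Transform stream, which handles the back-pressure between the two file streams for you. This is only a minimal sketch and assumes a Node version that supports the simplified stream constructors:

var fs = require('fs');
var stream = require('stream');

// a Transform stream that turns each incoming chunk into uppercase
var upperCase = new stream.Transform({
    transform: function (chunk, encoding, callback) {
        callback(null, chunk.toString().toUpperCase());
    }
});

fs.createReadStream('my_huge_file')
    .pipe(upperCase)                                  // convert chunk by chunk
    .pipe(fs.createWriteStream('my_converted_file'));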

Once you open the stream, node.js will open the file and start reading from it. As soon as the operating system passes some bytes to the thread that's reading the file, they will be passed along to your application.

Back to HTTP streams:

  1. The first issue is valid here as well: an attacker might send you large amounts of data to overwhelm your RAM and take down (DoS) your service.

  2. The second issue is even more important in this case: the network may be very slow (think smartphones), and it may take a long time for the client to send everything. By using a stream you can start processing the request while it is still arriving and shorten the response time, as sketched below.
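As a rough illustration (the file name and port are made up for this example), a server can start writing an uploaded request body to disk as soon as the first chunks arrive, instead of buffering the whole request first:

var http = require('http');
var fs = require('fs');

http.createServer(function (req, res) {
    // req is a readable stream: chunks arrive as the client sends them
    var out = fs.createWriteStream('upload.tmp');
    req.pipe(out);
    out.on('finish', function () {
        res.end('received\n');    // respond once the whole body has been written
    });
}).listen(8080);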

On pausing the HTTP stream: this is not done at the HTTP level, but way lower. If you pause the stream, node.js will simply stop reading from the underlying TCP socket. What happens then is up to the kernel. It may still buffer the incoming data, so it's ready for you once you've finished your current work. It may also inform the sender at the TCP level that it should pause sending data. Applications don't need to deal with that; it is none of their business. In fact, the sender application probably does not even realize that you are no longer actively reading!
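The same pause()/resume() mechanism is available on any readable stream, not just sockets. As a small sketch (reusing the file names from the earlier example), this is roughly what pipe() does for you internally: stop reading while the destination is congested and continue once its buffer has drained:

var fs = require('fs');
var readStream = fs.createReadStream('my_huge_file');
var writeStream = fs.createWriteStream('my_converted_file');

readStream.on('data', function (chunk) {
    // write() returns false once the writable side's internal buffer is full
    if (!writeStream.write(chunk)) {
        readStream.pause();                    // stop reading from the source
        writeStream.once('drain', function () {
            readStream.resume();               // buffer flushed, keep reading
        });
    }
});
readStream.on('end', function () {
    writeStream.end();
});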

So it's basically about being provided data as soon as it is available, but without overwhelming your resources. The underlying hard work is done either by the operating system (e.g. net, fs, http) or by the author of the stream you are using (e.g. zlib, which is a Transform stream and is usually bolted onto fs or net).
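To make that last point concrete, here is a small sketch (the file names are placeholders) of zlib sitting between two fs streams and compressing the data chunk by chunk as it flows through:

var fs = require('fs');
var zlib = require('zlib');

fs.createReadStream('my_huge_file')
    .pipe(zlib.createGzip())                          // Transform: compress each chunk
    .pipe(fs.createWriteStream('my_huge_file.gz'));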
