How would I design and implement a non-blocking memory mapping module for node.js
There is an mmap module for node.js: https://github.com/bnoordhuis/node-mmap/
As the author, Ben Noordhuis, notes, accessing mapped memory can block, which is why he no longer recommends the module and has discontinued it.
So I wonder: how would I design a non-blocking memory mapping module for node.js? Threads, fibers, something else?
This raises the related question of whether the threading would then simply happen somewhere other than the request handler.
When talking about implementing some native facility in a non-blocking fashion, the first place to look is libuv. It is how node's core modules interface with the underlying platform. Of particular interest is the work queue API.
If we take a quick look at node-mmap's source, we see that it's actually extremely simple. It calls mmap and returns a node Buffer that wraps the mapped memory region.
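Reading from this Buffer is what causes the OS to perform I/O: touching a page that is not yet in memory triggers a page fault, and the OS has to fetch it from disk. Because that read necessarily happens on the JS thread, we end up blocking the JS thread with disk I/O.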
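The effect of stalling the JS thread can be illustrated without any native code. The sketch below assumes a busy-wait is an acceptable stand-in for a page fault that waits on the disk; blockFor and measureTimerDelay are hypothetical helper names, not part of node or node-mmap.

```javascript
'use strict';

// Busy-waits for `ms` milliseconds, standing in for a page fault that
// stalls the JS thread while the OS reads from disk.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* spin */ }
}

// Schedules a 0 ms timer, then blocks the thread, and reports how long
// the timer actually had to wait before its callback could run.
function measureTimerDelay(blockMs, callback) {
  const scheduled = Date.now();
  setTimeout(() => callback(Date.now() - scheduled), 0);
  blockFor(blockMs);
}

measureTimerDelay(50, (delay) => {
  console.log('timer fired after ' + delay + ' ms');
});
```

The timer cannot fire until the blocking call returns, and the same is true of every other callback in the process, including incoming requests.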
Instead of returning a Buffer that allows JS direct access to the mapped memory, you should write a wrapper class in C++ that marshals reads and writes through the work queue. In this way, the disk I/O will happen on a separate thread.
In JS, you'd use it something like this:
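var fs = require('fs');
var mmap = require('mmap'); // the proposed non-blocking module (does not exist yet)

fs.open('/path/to/file', 'r', function(err, fd) {
  if (err) throw err;
  fs.fstat(fd, function(err, stats) {
    if (err) throw err;
    // map() returns a wrapper object rather than a Buffer
    var mapped = mmap.map(stats.size, mmap.PROT_READ, mmap.MAP_SHARED, fd, 0);
    // start and len are the byte offset and length the caller wants
    mapped.read(start, len, function(err, data) {
      // ...
    });
  });
});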
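The contract of such a wrapper can be prototyped in plain JavaScript before writing any native code. The sketch below is not a memory mapping: it assumes fs.read, which node already runs on the libuv thread pool, is a fair stand-in for the native worker. The FileRange class and its read(start, len, callback) method are hypothetical and only mirror the call shape shown above.

```javascript
'use strict';
const fs = require('fs');

// Hypothetical stand-in for the native wrapper: same read(start, len, cb)
// shape, but backed by fs.read instead of a mapped region.
class FileRange {
  constructor(fd, size) {
    this.fd = fd;
    this.size = size;
  }

  // Asynchronously copies `len` bytes starting at `start` into a new Buffer.
  read(start, len, callback) {
    if (start < 0 || len < 0 || start + len > this.size) {
      // Report range errors asynchronously, like any other failure.
      process.nextTick(callback, new RangeError('range outside of file'));
      return;
    }
    const target = Buffer.alloc(len);
    // The positional read runs off the JS thread; the callback runs on it.
    fs.read(this.fd, target, 0, len, start, (err, bytesRead) => {
      if (err) return callback(err);
      callback(null, target.subarray(0, bytesRead));
    });
  }
}
```

A caller would construct it with the descriptor and size obtained from fs.open and fs.fstat, then call read exactly as in the snippet above. A native version would keep this interface and replace the fs.read call with a copy out of the mapped region.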
On the C side, the read function would create a libuv work request (a uv_work_t) and queue it in the work queue with uv_queue_work. The C worker function would then read the mapped memory range (based on the caller's specifications), which may cause disk I/O, but this is safe because it is happening on a separate thread.
What happens next is interesting. The safe approach would be for the worker to allocate a new chunk of memory and memcpy the requested range into it from the mapped memory. The worker then passes along a pointer to the copy, and the C callback, which runs back on the JS thread, wraps it in a Buffer to be returned to JS-land.
You could also have the worker merely read over the range (so that any necessary I/O happens on the worker thread) without actually doing anything with the data, and then have the C callback simply wrap the mapped memory range in a Buffer. In theory, the parts of the file that the worker read would stay in RAM, so access to that portion of mapped memory would not block. However, I honestly don't know enough about mapped memory to say whether this might end up biting you.
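The two approaches differ in the same way a copy differs from a view. The sketch below shows that difference with ordinary Buffers; it assumes a JS-allocated Buffer named region is an acceptable stand-in for the mapped memory, which in the real module would belong to the mapping.

```javascript
'use strict';

// Stand-in for the mapped region; in the real module this memory would
// belong to the mapping, not to a JS-allocated Buffer.
const region = Buffer.from('abcdefgh');

// Copy (the safe approach): owns its own memory, so later changes to
// the region do not affect it.
const copy = Buffer.from(region.subarray(2, 6));

// View (wrapping the range directly): shares memory with the region,
// so later changes show through.
const view = region.subarray(2, 6);

region[2] = 0x58; // 'X'

console.log(copy.toString()); // cdef
console.log(view.toString()); // Xdef
```

The copy costs an allocation and a memcpy per read but is fully detached from the mapping. The view costs nothing up front, but every later access goes straight to the mapped pages.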
Finally, I'm dubious about whether this would actually perform any better than node's regular fs methods. I would only go down this road if I were doing something that really justifies using mmap.