NodeJS Copying File over a stream is very slow


Problem description



I am copying a file with Node on an SSD under VMware, but the performance is very low. The benchmark I ran to measure the actual disk speed is as follows:

$ hdparm -tT /dev/sda

/dev/sda:
 Timing cached reads:   12004 MB in  1.99 seconds = 6025.64 MB/sec
 Timing buffered disk reads: 1370 MB in  3.00 seconds = 456.29 MB/sec

However, the following Node code that copies the file is very slow, and even subsequent runs do not make it faster:

var fs  = require("fs");
fs.createReadStream("bigfile").pipe(fs.createWriteStream("tempbigfile"));

And it runs as follows:

$ seq 1 10000000 > bigfile
$ ll bigfile -h
-rw-rw-r-- 1 mustafa mustafa 848M Jun  3 03:30 bigfile
$ time node test.js 

real    0m4.973s
user    0m2.621s
sys     0m7.236s
$ time node test.js 

real    0m5.370s
user    0m2.496s
sys     0m7.190s

What is the issue here and how can I speed it up? I believe I could write this faster in C just by adjusting the buffer size. What confuses me is that when I wrote a simple, almost pv-equivalent program that pipes stdin to stdout, as shown below, it is very fast.

process.stdin.pipe(process.stdout);

And it runs as follows:

$ dd if=/dev/zero bs=8M count=128 | pv | dd of=/dev/null
128+0 records in 174MB/s] [        <=>                                                                                ]
128+0 records out
1073741824 bytes (1.1 GB) copied, 5.78077 s, 186 MB/s
   1GB 0:00:05 [ 177MB/s] [          <=>                                                                              ]
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB) copied, 5.78131 s, 186 MB/s
$ dd if=/dev/zero bs=8M count=128 |  dd of=/dev/null
128+0 records in
128+0 records out
1073741824 bytes (1.1 GB) copied, 5.57005 s, 193 MB/s
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB) copied, 5.5704 s, 193 MB/s
$ dd if=/dev/zero bs=8M count=128 | node test.js | dd of=/dev/null
128+0 records in
128+0 records out
1073741824 bytes (1.1 GB) copied, 4.61734 s, 233 MB/s
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB) copied, 4.62766 s, 232 MB/s
$ dd if=/dev/zero bs=8M count=128 | node test.js | dd of=/dev/null
128+0 records in
128+0 records out
1073741824 bytes (1.1 GB) copied, 4.22107 s, 254 MB/s
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB) copied, 4.23231 s, 254 MB/s
$ dd if=/dev/zero bs=8M count=128 | dd of=/dev/null
128+0 records in
128+0 records out
1073741824 bytes (1.1 GB) copied, 5.70124 s, 188 MB/s
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB) copied, 5.70144 s, 188 MB/s
$ dd if=/dev/zero bs=8M count=128 | node test.js | dd of=/dev/null
128+0 records in
128+0 records out
1073741824 bytes (1.1 GB) copied, 4.51055 s, 238 MB/s
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB) copied, 4.52087 s, 238 MB/s

Solution

I don't know the answer to your question, but perhaps this helps in your investigation of the problem.

In the Node.js documentation about streams, under Streams Under the Hood: Buffering, it says:

Both Writable and Readable streams will buffer data on an internal object called _writableState.buffer or _readableState.buffer, respectively.

The amount of data that will potentially be buffered depends on the highWaterMark option which is passed into the constructor.

[...]

The purpose of streams, especially with the pipe() method, is to limit the buffering of data to acceptable levels, so that sources and destinations of varying speed will not overwhelm the available memory.

So, you can play with the buffer sizes to improve speed:

var fs = require('fs');
var path = require('path');
var from = path.normalize(process.argv[2]);
var to = path.normalize(process.argv[3]);

// Set highWaterMark (the internal buffer size) to 64 KiB for both streams
var readOpts = {highWaterMark: Math.pow(2, 16)};
var writeOpts = {highWaterMark: Math.pow(2, 16)};

var source = fs.createReadStream(from, readOpts);
var destiny = fs.createWriteStream(to, writeOpts);

source.pipe(destiny);
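
If you want to experiment with other buffer sizes, here is a minimal variation of the script above (the filename copy.js and the optional third command-line argument are illustrative additions, not part of the original answer) that reads the highWaterMark from the command line:

var fs = require('fs');
var path = require('path');

var from = path.normalize(process.argv[2]);
var to = path.normalize(process.argv[3]);
// Optional third argument: buffer size in bytes; falls back to 64 KiB as above
var bufSize = parseInt(process.argv[4], 10) || Math.pow(2, 16);

var opts = {highWaterMark: bufSize};
fs.createReadStream(from, opts).pipe(fs.createWriteStream(to, opts));

You could then time runs with, say, 64 KiB versus 8 MiB buffers:

$ time node copy.js bigfile tempbigfile 65536
$ time node copy.js bigfile tempbigfile 8388608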
