Node.js: How to read a stream into a buffer?


Question

I wrote a pretty simple function that downloads an image from a given URL, resizes it, and uploads it to S3 (using 'gm' and 'knox'). I'm not sure whether I'm reading the stream into a buffer correctly. (Everything works, but is it the right way?)

Also, I want to understand something about the event loop: how do I know that one invocation of the function won't leak anything, or change the 'buf' variable of another already-running invocation (or is that scenario impossible because the callbacks are anonymous functions)?

var http = require('http');
var https = require('https');
var s3 = require('./s3');
var gm = require('gm');

module.exports.processImageUrl = function(imageUrl, filename, callback) {
    var client = http;
    if (imageUrl.substr(0, 5) == 'https') { client = https; }

    client.get(imageUrl, function(res) {
        if (res.statusCode != 200) {
            return callback(new Error('HTTP Response code ' + res.statusCode));
        }

        gm(res)
            .geometry(1024, 768, '>')
            .stream('jpg', function(err, stdout, stderr) {
                if (!err) {
                    var buf = new Buffer(0);
                    stdout.on('data', function(d) {
                        buf = Buffer.concat([buf, d]);
                    });

                    stdout.on('end', function() {
                        var headers = {
                            'Content-Length': buf.length,
                            'Content-Type': 'image/jpeg',
                            'x-amz-acl': 'public-read'
                        };

                        s3.putBuffer(buf, '/img/d/' + filename + '.jpg', headers, function(err, res) {
                            if (err) {
                                return callback(err);
                            } else {
                                return callback(null, res.client._httpMessage.url);
                            }
                        });
                    });
                } else {
                    callback(err);
                }
            });
    }).on('error', function(err) {
        callback(err);
    });
};

Answer

Overall I don't see anything that would break in your code.

Two suggestions:

The way you are combining Buffer objects is suboptimal, because it has to copy all of the pre-existing data on every 'data' event. It would be better to push the chunks into an array and concat them all at the end.

var bufs = [];
stdout.on('data', function(d){ bufs.push(d); });
stdout.on('end', function(){
  var buf = Buffer.concat(bufs);
});
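Wrapped up as a reusable helper (the function name here is illustrative, not from the original code), the collect-then-concat pattern looks like this:

```javascript
// Collect a readable stream into a single Buffer.
// Chunks are buffered in an array and concatenated once on 'end',
// instead of copying all accumulated data on every 'data' event.
function streamToBuffer(readable, callback) {
    var bufs = [];
    readable.on('data', function(d) { bufs.push(d); });
    readable.on('error', callback);
    readable.on('end', function() {
        callback(null, Buffer.concat(bufs));
    });
}
```

This also centralizes error handling: an 'error' event on the stream reaches the same callback as a successful result.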

For performance, I would look into whether the S3 library you are using supports streams. Ideally you wouldn't need to create one large buffer at all; instead you'd pass the stdout stream directly to the S3 library.
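As a general sketch of what stream-to-stream uploading looks like (the writable sink here stands in for a hypothetical streaming S3 client; the actual knox API may differ), no intermediate buffer is ever allocated:

```javascript
// Pipe a readable source into a writable sink, reporting completion
// or failure through a single callback. Data flows chunk by chunk,
// so memory usage stays bounded regardless of file size.
function uploadStream(readable, sink, callback) {
    readable.on('error', callback);
    sink.on('error', callback);
    sink.on('finish', function() { callback(null); });
    readable.pipe(sink);
}
```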

As for the second part of your question, that scenario isn't possible. When a function is called, it gets its own private context, and everything defined inside it is only accessible to other code defined inside that same function.
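A tiny illustration of that isolation (not from the original code): each call below creates a fresh closure over its own array, so two in-flight invocations cannot clobber each other's state:

```javascript
// Every invocation of makeCollector creates a new private 'bufs'
// array; state from one call is invisible to any other call.
function makeCollector() {
    var bufs = [];
    return {
        push: function(d) { bufs.push(d); },
        result: function() { return Buffer.concat(bufs).toString(); }
    };
}

var a = makeCollector();
var b = makeCollector();
a.push(Buffer.from('first'));
b.push(Buffer.from('second'));
// a.result() and b.result() are independent
```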

Dumping the file to the filesystem would probably mean less memory usage per request, but file IO can be pretty slow so it might not be worth it. I'd say that you shouldn't optimize too much until you can profile and stress-test this function. If the garbage collector is doing its job you may be overoptimizing.

With all that said, there are better ways anyway, so don't use files. Since all you need is the length, you can calculate it without appending the buffers together, so you don't need to allocate a new Buffer at all.

var pause_stream = require('pause-stream');

// Your other code.

var bufs = [];
stdout.on('data', function(d){ bufs.push(d); });
stdout.on('end', function(){
  var contentLength = bufs.reduce(function(sum, buf){
    return sum + buf.length;
  }, 0);

  // Create a stream that will emit your chunks when resumed.
  var stream = pause_stream();
  stream.pause();
  while (bufs.length) stream.write(bufs.shift());
  stream.end();

  var headers = {
      'Content-Length': contentLength,
      // ...
  };

  s3.putStream(stream, ....);
});
