Meteor: uploading file from client to Mongo collection vs file system vs GridFS

Question

Meteor is great, but it lacks native support for traditional file uploading. There are several options for handling file uploads:

From the client, data can be sent using:

  • Meteor.call('saveFile', data) or collection.insert({file: data})
  • a 'POST' form or HTTP.call('POST')

On the server, the file can be saved to:

  • a mongodb collection, via collection.insert({file: data})
  • the file system, in /path/to/dir
  • mongodb GridFS

What are the pros and cons of these methods, and how are they best implemented? I am aware that there are other options too, such as saving to a third-party site and obtaining a URL.

Recommended answer

You can achieve file uploading with Meteor without using any additional packages or a third party.

Option 1: DDP, save the file to a Mongo collection

/*** client.js ***/

// Assign a change event to the input tag (this handler goes inside a template's events map)
'change input': function (event, template) {
    var file = event.target.files[0]; // assuming one file only
    if (!file) return;

    var reader = new FileReader(); // create a reader according to the HTML5 File API

    reader.onload = function () {
        var buffer = new Uint8Array(reader.result); // wrap the ArrayBuffer in a typed array
        Meteor.call('saveFile', buffer);
    };

    reader.readAsArrayBuffer(file); // read the file as an ArrayBuffer
}

/*** server.js ***/ 

Files = new Mongo.Collection('files');

Meteor.methods({
    'saveFile': function(buffer){
        Files.insert({data:buffer})         
    }   
});

Explanation

First, the file is grabbed from the input using the HTML5 File API. A reader is created using new FileReader, and the file is read with readAsArrayBuffer. If you console.log the resulting ArrayBuffer it shows up as {}, and DDP can't send it over the wire, so it has to be converted to a Uint8Array.

When you pass this to Meteor.call, Meteor automatically runs EJSON.stringify(Uint8Array) and sends the result over DDP. If you inspect the websocket traffic in the Chrome console, you will see a string resembling base64.

On the server side, Meteor calls EJSON.parse() and converts it back to a buffer.
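The cost of sending binary data as a base64-like string can be estimated with a small sketch. This is not Meteor's EJSON code: it uses Node's built-in Buffer as a stand-in for the encoding step, and the 300 KB payload and the helper name base64Length are assumptions made for illustration only.

```javascript
// Sketch only: Node's Buffer stands in for the base64 step that EJSON performs.
// base64 turns every 3 bytes into 4 characters (with padding at the end).
function base64Length(byteLength) {
  return 4 * Math.ceil(byteLength / 3);
}

// Hypothetical 300 KB payload, standing in for the Uint8Array built on the client.
const payload = new Uint8Array(300 * 1024);

// Encode to text (what travels over the wire) and decode again (server side).
const encoded = Buffer.from(payload).toString('base64');
const decoded = new Uint8Array(Buffer.from(encoded, 'base64'));

// Ratio of encoded characters to original bytes.
const overhead = encoded.length / payload.length;

console.log('original bytes :', payload.length);
console.log('encoded chars  :', encoded.length);
console.log('overhead factor:', overhead.toFixed(3));
```

The ratio comes out at about 4/3, which is where the roughly 33% bandwidth overhead of this option originates.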

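Pros

  1. Simple, no hacks, no extra packages
  2. Sticks to the "data on the wire" principle

Cons

  1. More bandwidth: the resulting base64 string is ~33% larger than the original file
  2. File size limit: large files can't be sent (limit ~16 MB?)
  3. No caching
  4. No gzip or compression yet
  5. Takes up a lot of memory if you publish files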

Option 2: XHR, post from client to file system

/*** client.js ***/

// assign a change event to the input tag
'change input' : function(event,template){ 
    var file = event.target.files[0]; 
    if (!file) return;      

    var xhr = new XMLHttpRequest(); 
    xhr.open('POST', '/uploadSomeWhere', true);
    xhr.onload = function(event){...}

    xhr.send(file); 
}

/*** server.js ***/ 

var fs = Npm.require('fs');

// using the internal webapp package or iron:router
WebApp.connectHandlers.use('/uploadSomeWhere',function(req,res){
    //var start = Date.now()        
    var file = fs.createWriteStream('/path/to/dir/filename'); 

    file.on('error',function(error){...});
    file.on('finish',function(){
        res.writeHead(...) 
        res.end(); // end the response
        //console.log('Finish uploading, time taken: ' + (Date.now() - start));
    });

    req.pipe(file); // pipe the request into the file
});

Explanation

The file is grabbed in the client, an XHR object is created, and the file is sent to the server via 'POST'.

On the server, the data is piped into the underlying file system. Before saving, you can also determine the filename, sanitise it, check whether it already exists, and so on.
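The filename handling mentioned above could look roughly like the following sketch. The helper name safeUploadPath, the allowed character set, and the example directory are assumptions for illustration; a real application may want stricter or different rules.

```javascript
const path = require('path');

// Hypothetical helper: turn a client-supplied name into a path that stays
// inside uploadDir. Directory parts are dropped and unusual characters
// are replaced with underscores.
function safeUploadPath(uploadDir, requestedName) {
  // Keep only the last path segment, so "../../etc/passwd" becomes "passwd".
  const base = path.basename(String(requestedName));

  // Allow letters, digits, dot, underscore and hyphen; replace the rest.
  const cleaned = base.replace(/[^A-Za-z0-9._-]/g, '_');

  // Reject names that are empty or refer to a directory.
  if (cleaned === '' || cleaned === '.' || cleaned === '..') {
    throw new Error('Invalid file name');
  }
  return path.join(uploadDir, cleaned);
}

console.log(safeUploadPath('/path/to/dir', '../../etc/passwd'));
console.log(safeUploadPath('/path/to/dir', 'my photo.png'));
```

To cover the "already exists" check, one option is to pass { flags: 'wx' } to fs.createWriteStream, which makes the stream emit an error instead of overwriting an existing file.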

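Pros

  1. Taking advantage of XHR 2, you can send an ArrayBuffer directly; no new FileReader() is needed, unlike option 1
  2. An ArrayBuffer is less bulky than a base64 string
  3. No size limit: I sent a ~200 MB file on localhost with no problem
  4. The file system is faster than mongodb (more on this in the benchmark below)
  5. Cacheable and gzip-able

Cons

  1. XHR 2 is not available in older browsers, e.g. below IE10, but of course you can implement a traditional post <form>. I only used xhr = new XMLHttpRequest() rather than HTTP.call('POST') because the current HTTP.call in Meteor cannot yet send an ArrayBuffer (point it out if I am wrong).
  2. /path/to/dir/ has to be outside the Meteor project, otherwise writing a file into /public triggers a reload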


Option 3: XHR, save to GridFS

/*** client.js ***/

//same as option 2


/*** version A: server.js ***/  

var db = MongoInternals.defaultRemoteCollectionDriver().mongo.db;
var GridStore = MongoInternals.NpmModule.GridStore;

WebApp.connectHandlers.use('/uploadSomeWhere',function(req,res){
    //var start = Date.now()        
    var file = new GridStore(db,'filename','w');

    file.open(function(error,gs){
        file.stream(true); //true will close the file automatically once piping finishes

        file.on('error',function(e){...});
        file.on('end',function(){
            res.end(); // send the end of the response
            //console.log('Finish uploading, time taken: ' + (Date.now() - start));
        });

        req.pipe(file);
    });     
});

/*** version B: server.js ***/  

var db = MongoInternals.defaultRemoteCollectionDriver().mongo.db;
var GridStore = Npm.require('mongodb').GridStore; //also need to add Npm.depends({mongodb:'2.0.13'}) in package.js

WebApp.connectHandlers.use('/uploadSomeWhere',function(req,res){
    //var start = Date.now()        
    var file = new GridStore(db,'filename','w').stream(true); //start the stream 

    file.on('error',function(e){...});
    file.on('end',function(){
        res.end(); // send the end of the response
        //console.log('Finish uploading, time taken: ' + (Date.now() - start));
    });
    req.pipe(file);
});     

Explanation

The client script is the same as in option 2.

According to the last line of mongo_driver.js in Meteor 1.0.x, a global object called MongoInternals is exposed. You can call defaultRemoteCollectionDriver() on it to get the current database db object, which GridStore requires. In version A, GridStore is also exposed by MongoInternals. The mongo driver used by the current Meteor is v1.4.x.

Then, inside a route, you can create a new write object by calling var file = new GridStore(...) (see the GridStore API). You then open the file and create a stream.

I also included a version B. In this version, GridStore is obtained from a newer mongodb driver via Npm.require('mongodb'); this driver is v2.0.13, the latest as of this writing. The new API doesn't require you to open the file: you can call stream(true) directly and start piping.

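Pros

  1. Same as option 2: data is sent as an ArrayBuffer, with less overhead than the base64 string in option 1
  2. No need to worry about filename sanitisation
  3. Separation from the file system: no need to write to a temp dir, and the db can be backed up, replicated, sharded, etc.
  4. No need to add any other package
  5. Cacheable and can be gzipped
  6. Can store much larger files than a normal mongo collection
  7. Piping reduces memory overload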

Cons

  1. Unstable Mongo GridFS. I included version A (mongo 1.x) and B (mongo 2.x). In version A, when piping large files > 10 MB, I got lots of errors, including corrupted files and unfinished pipes. This problem is solved in version B using mongo 2.x; hopefully Meteor will upgrade to mongodb 2.x soon
  2. API confusion. In version A, you need to open the file before you can stream, but in version B you can stream without calling open. The API doc is also not very clear, and the stream is not 100% interchangeable with Npm.require('fs'): in fs you listen for file.on('finish'), but in GridFS you listen for file.on('end') when writing finishes.
  3. GridFS doesn't provide write atomicity, so if there are multiple concurrent writes to the same file, the final result may be very different
  4. Speed. Mongo GridFS is much slower than the file system.

Benchmark

As you can see in option 2 and option 3, I included var start = Date.now(), and when writing ends I console.log the elapsed time in ms. The results are below. Test machine: dual core, 4 GB RAM, HDD, Ubuntu 14.04 based.

File size   GridFS (ms)   FS (ms)
100 KB      50            2
1 MB        400           30
10 MB       3500          100
200 MB      80000         1240

You can see that FS is much faster than GridFS. For a 200 MB file, it takes ~80 sec using GridFS but only ~1 sec with FS. I haven't tried an SSD; the result may be different. In real life, however, bandwidth may dictate how fast the file is streamed from client to server, and achieving a 200 MB/sec transfer speed is not typical. On the other hand, a transfer speed of ~2 MB/sec (GridFS) is more the norm.
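To make the comparison easier to read, the timings can be converted into throughput. The sketch below only re-expresses the numbers from the benchmark table; treating 100 KB as 0.1 MB is a simplifying assumption, and the helper name throughputMBps is made up for this example.

```javascript
// Benchmark figures copied from the table above (times in milliseconds).
// Assumption: 100 KB is approximated as 0.1 MB.
const results = [
  { sizeMB: 0.1, gridfsMs: 50,    fsMs: 2 },
  { sizeMB: 1,   gridfsMs: 400,   fsMs: 30 },
  { sizeMB: 10,  gridfsMs: 3500,  fsMs: 100 },
  { sizeMB: 200, gridfsMs: 80000, fsMs: 1240 },
];

// Throughput in MB per second for a given size and elapsed time.
function throughputMBps(sizeMB, elapsedMs) {
  return sizeMB / (elapsedMs / 1000);
}

// One summary row per file size, including how many times faster FS was.
const summary = results.map(function (r) {
  return {
    sizeMB: r.sizeMB,
    gridfs: throughputMBps(r.sizeMB, r.gridfsMs),
    fs: throughputMBps(r.sizeMB, r.fsMs),
    speedup: r.gridfsMs / r.fsMs,
  };
});

summary.forEach(function (s) {
  console.log(
    s.sizeMB + ' MB: GridFS ' + s.gridfs.toFixed(1) + ' MB/s, FS ' +
    s.fs.toFixed(1) + ' MB/s, FS faster by ' + s.speedup.toFixed(1) + 'x'
  );
});
```

For the 200 MB file this works out to 2.5 MB/s for GridFS against roughly 161 MB/s for the file system, which is the gap the paragraph above describes.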

Conclusion

This is by no means comprehensive, but it should help you decide which option best fits your needs.

  • DDP is the simplest and sticks to the core Meteor principle, but the data is bulkier, not compressible during transfer, and not cacheable. This option may still be good if you only need small files.
  • XHR coupled with the file system is the 'traditional' way. Stable API, fast, 'streamable', compressible, cacheable (ETag etc.), but the files need to live in a separate folder.
  • XHR coupled with GridFS gives you the benefits of replica sets, scalability, no touching of file system directories, large files, and many files if the file system restricts their number; it is also cacheable and compressible. However, the API is unstable, you get errors on multiple writes, and it's s..l..o..w..

Hopefully Meteor DDP will soon support gzip, caching, etc., and GridFS will get faster...
