Meteor: uploading file from client to Mongo collection vs file system vs GridFS


Question


        Meteor is great, but it lacks native support for traditional file uploading. There are several options for handling file uploads:

        From the client, data can be sent using:

        • Meteor.call('saveFile',data) or collection.insert({file:data})
        • 'POST' form or HTTP.call('POST')

        On the server, the file can be saved to:

        • a mongodb file collection by collection.insert({file:data})
        • file system in /path/to/dir
        • mongodb GridFS

        What are the pros and cons of these methods, and how best to implement them? I am aware that there are also other options, such as saving to a third-party site and obtaining a URL.

        Solution

        You can achieve file uploading quite simply with Meteor, without using any more packages or a third party.

        Option 1: DDP, saving file to a mongo collection

        /*** client.js ***/
        
        // assign a change event to the input tag
        'change input' : function(event,template){ 
            var file = event.target.files[0]; //assuming 1 file only
            if (!file) return;
        
            var reader = new FileReader(); //create a reader according to HTML5 File API
        
            reader.onload = function(event){          
              var buffer = new Uint8Array(reader.result) // convert to binary
              Meteor.call('saveFile', buffer);
            }
        
            reader.readAsArrayBuffer(file); //read the file as arraybuffer
        }
        
        /*** server.js ***/ 
        
        Files = new Mongo.Collection('files');
        
        Meteor.methods({
            'saveFile': function(buffer){
                Files.insert({data:buffer})         
            }   
        });
        

        Explanation

        First, the file is grabbed from the input using the HTML5 File API. A reader is created with new FileReader, and the file is read via readAsArrayBuffer. If you console.log this ArrayBuffer it shows {}, and DDP can't send it over the wire, so it has to be converted to a Uint8Array.

        When you put this in Meteor.call, Meteor automatically runs EJSON.stringify(Uint8Array) and sends it with DDP. You can inspect the data in the Chrome console's WebSocket traffic; you will see a string resembling base64.

        On the server side, Meteor calls EJSON.parse() and converts it back to a buffer.
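
        To make that conversion concrete, here is a minimal sketch (run in a Meteor client or server console; the byte values are arbitrary examples) of how a Uint8Array round-trips through EJSON, which is what DDP does internally:

        var bytes = new Uint8Array([72, 105, 33]);   // "Hi!" as raw bytes
        var wire  = EJSON.stringify(bytes);          // roughly what DDP puts on the wire
        console.log(wire);                           // a base64-carrying string, e.g. {"$binary":"SGkh"}
        console.log(EJSON.parse(wire));              // back to a Uint8Array on the other end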

        Pros

        1. Simple, with no hacky workarounds and no extra packages
        2. Sticks to the 'data on the wire' principle

        Cons

        1. More bandwidth: the resulting base64 string is ~33% larger than the original file
        2. File size limit: can't send big files (limit ~16 MB?)
        3. No caching
        4. No gzip or compression yet
        5. Takes up lots of memory if you publish files (a read-back sketch illustrating this follows below)
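
        To put con 5 in context, here is a hypothetical read-back sketch; it assumes a 'files' publication exists on the server (the answer above does not define one), and every published document's binary payload ends up in client memory:

        /*** client.js (hypothetical read-back sketch) ***/

        Files = new Mongo.Collection('files');        // declare the same collection on the client
        Meteor.subscribe('files');                    // assumes Meteor.publish('files', ...) on the server

        // once the subscription is ready, turn a stored Uint8Array back into something the browser can use
        var doc = Files.findOne();
        if (doc) {
            var blob = new Blob([doc.data]);          // doc.data is the Uint8Array saved by 'saveFile'
            var url  = URL.createObjectURL(blob);     // e.g. set as an <img src> or <a download> href
        }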


        Option 2: XHR, post from client to file system

        /*** client.js ***/
        
        // assign a change event to the input tag
        'change input' : function(event,template){ 
            var file = event.target.files[0]; 
            if (!file) return;      
        
            var xhr = new XMLHttpRequest(); 
            xhr.open('POST', '/uploadSomeWhere', true);
            xhr.onload = function(event){...}
        
            xhr.send(file); 
        }
        
        /*** server.js ***/ 
        
        var fs = Npm.require('fs');
        
        //using the internal webapp package or iron:router
        WebApp.connectHandlers.use('/uploadSomeWhere',function(req,res){
            //var start = Date.now()        
            var file = fs.createWriteStream('/path/to/dir/filename'); 
        
            file.on('error',function(error){...});
            file.on('finish',function(){
                res.writeHead(...) 
                res.end(); //end the response
                //console.log('Finish uploading, time taken: ' + Date.now() - start);
            });
        
            req.pipe(file); //pipe the request to the file
        });
        

        Explanation

        The file is grabbed on the client, an XHR object is created, and the file is sent via 'POST' to the server.

        On the server, the data is piped to the underlying file system. Before saving, you can additionally determine the filename, perform sanitisation, or check whether the file already exists, as in the sketch below.
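
        A hypothetical variant of the upload route above showing that pre-save step (the query parameter name, the sanitisation rule, and the upload directory are assumptions, not part of the original answer; it replaces, rather than adds to, the simpler handler):

        /*** server.js (hypothetical sanitisation sketch) ***/

        var fs   = Npm.require('fs');
        var path = Npm.require('path');
        var url  = Npm.require('url');

        WebApp.connectHandlers.use('/uploadSomeWhere',function(req,res){
            // assume the client appends the name, e.g. xhr.open('POST', '/uploadSomeWhere?name=' + file.name, true)
            var name   = url.parse(req.url, true).query.name || 'untitled';
            var safe   = path.basename(name).replace(/[^a-zA-Z0-9._-]/g, '_'); // strip directories and odd characters
            var target = path.join('/path/to/dir', safe);

            if (fs.existsSync(target)) {              // refuse to overwrite an existing file
                res.writeHead(409);
                return res.end('File already exists');
            }

            var file = fs.createWriteStream(target);
            file.on('error', function(error){ res.writeHead(500); res.end(); });
            file.on('finish', function(){
                res.writeHead(200);
                res.end();
            });

            req.pipe(file);
        });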

        Pros

        1. Takes advantage of XHR 2, so you can send an ArrayBuffer; no new FileReader() is needed, unlike option 1
        2. An ArrayBuffer is less bulky than a base64 string
        3. No size limit; I sent a ~200 MB file on localhost with no problem
        4. The file system is faster than mongodb (more on this in the benchmark below)
        5. Cachable and gzippable (a sketch of a matching download route follows the cons below)

        Cons

        1. XHR 2 is not available in older browsers (e.g. below IE10), but of course you can implement a traditional POST <form> as a fallback. I used xhr = new XMLHttpRequest() rather than HTTP.call('POST') because the current HTTP.call in Meteor is not yet able to send an ArrayBuffer (correct me if I am wrong).
        2. /path/to/dir/ has to be outside the Meteor app, otherwise writing a file into /public triggers a reload
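
        For completeness, a hypothetical companion route for serving a saved file back (the route name, path, and content type are placeholders); since it is plain HTTP, caching headers and gzip can be layered on top, which is what makes option 2 cachable:

        /*** server.js (hypothetical download sketch) ***/

        var fs = Npm.require('fs');                   // already required above for the upload route

        WebApp.connectHandlers.use('/downloadSomeWhere',function(req,res){
            var target = '/path/to/dir/filename';     // placeholder, the same file the upload route wrote
            if (!fs.existsSync(target)) {
                res.writeHead(404);
                return res.end();
            }
            res.writeHead(200, {'Content-Type': 'application/octet-stream'});
            fs.createReadStream(target).pipe(res);    // stream from disk straight into the response
        });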


        Option 3: XHR, save to GridFS

        /*** client.js ***/
        
        //same as option 2
        
        
        /*** version A: server.js ***/  
        
        var db = MongoInternals.defaultRemoteCollectionDriver().mongo.db;
        var GridStore = MongoInternals.NpmModule.GridStore;
        
        WebApp.connectHandlers.use('/uploadSomeWhere',function(req,res){
            //var start = Date.now()        
            var file = new GridStore(db,'filename','w');
        
            file.open(function(error,gs){
                file.stream(true); //true will close the file automatically once piping finishes
        
                file.on('error',function(e){...});
                file.on('end',function(){
                    res.end(); //send end response
                    //console.log('Finish uploading, time taken: ' + Date.now() - start);
                });
        
                req.pipe(file);
            });     
        });
        
        /*** version B: server.js ***/  
        
        var db = MongoInternals.defaultRemoteCollectionDriver().mongo.db;
        var GridStore = Npm.require('mongodb').GridStore; //also need to add Npm.depends({mongodb:'2.0.13'}) in package.js
        
        WebApp.connectHandlers.use('/uploadSomeWhere',function(req,res){
            //var start = Date.now()        
            var file = new GridStore(db,'filename','w').stream(true); //start the stream 
        
            file.on('error',function(e){...});
            file.on('end',function(){
                res.end(); //send end response
                //console.log('Finish uploading, time taken: ' + Date.now() - start);
            });
            req.pipe(file);
        });     
        

        Explanation

        The client script is the same as in option 2.

        According to the last line of Meteor 1.0.x's mongo_driver.js, a global object called MongoInternals is exposed; you can call defaultRemoteCollectionDriver() to get the current database db object, which is required for the GridStore. In version A, GridStore is also exposed by MongoInternals. The mongo driver used by the current Meteor is v1.4.x.

        Then inside a route, you can create a new write object by calling var file = new GridStore(...) (API). You then open the file and create a stream.

        I also included a version B. In this version, GridStore is loaded from a newer mongodb driver via Npm.require('mongodb'); that driver is the latest, v2.0.13, as of this writing. The new API doesn't require you to open the file; you can call stream(true) directly and start piping.
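
        For reading a file back out of GridFS, here is a hypothetical sketch along the same lines, with the GridStore opened in 'r' mode instead of 'w'; as noted in the cons below, the streaming API differs between driver 1.x and 2.x, so check the docs for the version you use:

        /*** server.js (hypothetical GridFS download sketch) ***/

        WebApp.connectHandlers.use('/downloadSomeWhere',function(req,res){
            // db and GridStore as defined in the upload routes above
            var gs = new GridStore(db, 'filename', 'r');
            gs.open(function(error, gridStore){
                if (error) { res.writeHead(404); return res.end(); }
                res.writeHead(200, {'Content-Type': 'application/octet-stream'});
                gridStore.stream().pipe(res);         // pipe the stored chunks into the response
            });
        });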

        Pros

        1. Same as option 2: sent as an ArrayBuffer, with less overhead than option 1's base64 string
        2. No need to worry about file name sanitisation
        3. Separation from the file system: no need to write to a temp dir, and the db can be backed up, replicated, sharded etc
        4. No need to implement any other package
        5. Cachable and can be gzipped
        6. Stores much larger sizes than a normal mongo collection allows
        7. Piping reduces memory overload

        Cons

        1. Unstable Mongo GridFS. I included version A (mongo 1.x) and B (mongo 2.x). In version A, when piping large files > 10 MB, I got lots of errors, including corrupted files and unfinished pipes. This problem is solved in version B using mongo 2.x; hopefully Meteor will upgrade to mongodb 2.x soon.
        2. API confusion. In version A, you need to open the file before you can stream, but in version B you can stream without calling open. The API doc is also not very clear, and the stream is not 100% syntax-compatible with Npm.require('fs'): in fs you listen for file.on('finish'), but in GridFS you listen for file.on('end') when writing finishes.
        3. GridFS doesn't provide write atomicity, so if there are multiple concurrent writes to the same file, the final result may be very different
        4. Speed. Mongo GridFS is much slower than file system.

        Benchmark

        As you can see in options 2 and 3, I included var start = Date.now() and, when the write ends, console.log the elapsed time in ms. Below are the results, on a dual-core, 4 GB RAM, HDD, Ubuntu 14.04 based machine.

        file size   GridFS (ms)   FS (ms)
        100 KB      50            2
        1 MB        400           30
        10 MB       3500          100
        200 MB      80000         1240
        

        You can see that FS is much faster than GridFS. For a 200 MB file, it takes ~80 sec using GridFS but only ~1 sec with FS. I haven't tried an SSD; the results may differ. However, in real life the network bandwidth may dictate how fast the file streams from client to server: a 200 MB/sec transfer speed is not typical, whereas ~2 MB/sec (the GridFS figure) is closer to the norm.

        Conclusion

        By no means is this comprehensive, but it should help you decide which option best fits your needs.

        • DDP is the simplest and sticks to the core Meteor principle, but the data is bulkier, and not compressible or cachable during transfer. This option may still be fine if you only need small files.
        • XHR coupled with the file system is the 'traditional' way: a stable API, fast, streamable, compressible, and cachable (ETag etc), but the upload directory needs to live in a separate folder outside the app.
        • XHR coupled with GridFS gives you the benefits of replica sets and scalability, no touching of file system directories, and support for large files and many files (if the file system restricts their number); it is also cachable and compressible. However, the API is unstable, you get errors on multiple writes, and it's s..l..o..w..

        Hopefully Meteor DDP will soon support gzip, caching etc, and GridFS will get faster...
