How to save 1 million records to mongodb asynchronously?

Question

I want to save 1 million records to MongoDB using JavaScript like this:

for (var i = 0; i < 1000000; i++) {
  var model = buildModel(i);
  db.save(model, function(err, done) {
    console.log('cool');
  });
}

I tried it: it saved ~160 records, then hung for about 2 minutes, then exited. Why?

Recommended answer

It blew up because you are not waiting for each asynchronous call to complete before moving on to the next iteration. That means you are building up a "stack" of unresolved operations until it causes a problem. What is the name of this site again? Get the picture?
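To make the pile-up concrete, here is a minimal sketch that counts outstanding operations under both approaches. It does not talk to MongoDB: fakeSave is a hypothetical stand-in for db.save that simply completes on a later turn of the event loop, and fireAndForget and saveInSequence are illustrative names, not part of any library.

```javascript
// Hypothetical stand-in for db.save: it completes on a later turn of the
// event loop, roughly as a real network call would.
function fakeSave(doc, callback) {
  setImmediate(function () { callback(null, doc); });
}

// The question's pattern: start every save without waiting for any of them.
function fireAndForget(total) {
  var pending = 0;
  for (var i = 0; i < total; i++) {
    pending++;
    fakeSave({ n: i }, function () { pending--; });
  }
  // No callback can run while the loop holds the thread, so every
  // operation is still outstanding at this point.
  return pending;
}

// Waiting for each callback before starting the next save keeps at most
// one operation outstanding. `done` receives (err, peak).
function saveInSequence(total, done) {
  var i = 0, pending = 0, peak = 0;
  function next() {
    if (i >= total) return done(null, peak);
    pending++;
    peak = Math.max(peak, pending);
    fakeSave({ n: i++ }, function (err) {
      pending--;
      if (err) return done(err);
      next();
    });
  }
  next();
}
```

With this stand-in, fireAndForget(1000) returns 1000 because nothing has completed by the time the loop ends, while saveInSequence never has more than one save in flight. A real driver adds buffering and connection limits on top of this, so treat the numbers as an illustration of the shape of the problem only.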

So this is not the best way to proceed with "bulk" insertions. Fortunately, the underlying MongoDB driver has already thought about this, aside from the callback issue mentioned earlier: there is a "Bulk API" that makes this a whole lot better. You could use it directly if you already have the native driver as the db object, but I prefer just using the .collection accessor from the model, together with the "async" module to make everything clear:

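var async = require('async'); // the "async" utility module mentioned above

var bulk = Model.collection.initializeOrderedBulkOp();
var counter = 0;

async.whilst(
  // Loop condition. This is the synchronous form used by async v2 and
  // earlier; async v3 passes a callback to the test function instead.
  function() { return counter < 1000000; },

  // Loop body
  function(callback) {
    counter++;
    var model = buildModel(counter);
    bulk.insert(model);

    if ( counter % 1000 == 0 ) {
      // Send the batch and wait for the server before queuing more
      bulk.execute(function(err,result) {
        bulk = Model.collection.initializeOrderedBulkOp();
        callback(err);
      });
    } else {
      callback();
    }
  },

  // When all is done
  function(err) {
    if ( err ) return console.error(err);

    if ( counter % 1000 != 0 ) {
      // Flush the final partial batch before reporting completion
      bulk.execute(function(err,result) {
        if ( err ) return console.error(err);
        console.log( "inserted some more" );
        console.log( "I'm finished now" );
      });
    } else {
      console.log( "I'm finished now" );
    }
  }
);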

The difference is twofold: the "asynchronous" callbacks are used to wait for completion rather than just building up a stack, and the "Bulk Operations API" cuts down the number of asynchronous write calls by submitting everything in batched write statements of 1000 entries.

This not only avoids "building up a stack" of function executions like your own example code does, but also makes efficient use of the "wire" by not sending everything as individual statements, instead breaking the work up into manageable "batches" for the server to commit.
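For readers on a current Node.js version, the same batching idea can be sketched with async/await instead of the "async" module. This is an alternative to the code above, not the original answer's method: it uses the driver's insertMany in place of the Bulk API, and the helper names batches and insertInBatches are made up for illustration. It assumes `collection` exposes insertMany(docs) returning a promise, as the official Node.js driver's Collection does, and that buildModel is the question's own document factory.

```javascript
// Yields arrays of at most `size` documents. buildModel is the question's
// (unspecified) document factory, passed in here as a parameter.
function* batches(total, size, buildModel) {
  let batch = [];
  for (let i = 1; i <= total; i++) {
    batch.push(buildModel(i));
    if (batch.length === size) {
      yield batch;
      batch = [];
    }
  }
  if (batch.length > 0) yield batch; // leftover partial batch
}

// Sends one batch at a time and waits for the server before building
// the next one, so only one write is ever outstanding.
async function insertInBatches(collection, total, size, buildModel) {
  let inserted = 0;
  for (const batch of batches(total, size, buildModel)) {
    await collection.insertMany(batch);
    inserted += batch.length;
  }
  return inserted; // number of documents handed to insertMany
}
```

A call might look like insertInBatches(Model.collection, 1000000, 1000, buildModel), which resolves once the last batch has been acknowledged. Because the generator builds each batch lazily, only about one batch of documents is held in memory at a time.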
