Quickly add identical metadata when inserting thousands of records in MongoDB?


Question

I want to insert an array with thousands of objects into a MongoDB collection (see db.collection.insert, https://docs.mongodb.org/manual/reference/method/db.collection.insert/#insert-multiple-documents).

db.col.insert(
   [
     { },
     { },
     { } // A couple of thousand more
   ],
   {
       ordered : false,
       writeConcern : { w : 0 } // the write concern is a document; { w : 0 } means unacknowledged writes
   }
);


However, I also want to identify these groups using metadata. Every record from the array needs to have some data assigned, and this data is identical for all records in the array.


Is there a way I can insert all the documents and also set, for every document, something like:

{
    dateTime : '111111111',
    groupId  : 'some hash',
    batchId  : 'other hash'
}


without manually adding this to each of the thousands of records in the array? That would be a big performance hit (and just ugly).


I used to store these records as one array inside a single document, together with the metadata:

{
    dateTime : '111111111',
    groupId  : 'some hash',
    batchId  : 'other hash',
    batchArr : [ /* array with thousands of records */ ]
}


and use $unwind on it. However, this is no longer possible, because the number of records is starting to exceed MongoDB's 16 MB BSON document size limit.
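
For reference, reading the records back out of that wrapper document looked roughly like this (a minimal sketch, assuming the wrapper document above is stored in db.col):

db.col.aggregate([
    { $unwind : "$batchArr" } // emit one result document per element of batchArr
]);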

Answer


This would be a pretty good candidate for the Bulk API. There are two types of bulk operations:


  • Ordered bulk operations. These operations execute all the operations in order and error out on the first write error.
  • Unordered bulk operations. These operations execute all the operations in parallel and aggregate all the errors. Unordered bulk operations do not guarantee order of execution.
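
For illustration, both builders are created the same way in the mongo shell; only the initializer differs (a minimal sketch):

// Ordered: operations run in sequence and stop at the first write error
var orderedBulk = db.col.initializeOrderedBulkOp();

// Unordered: operations may run in parallel; all errors are collected
var unorderedBulk = db.col.initializeUnorderedBulkOp();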


Consider initialising a Bulk() operations builder and adding a series of insert operations to add multiple documents in batches, thereby streamlining your performance:

var bulk = db.col.initializeOrderedBulkOp(),
    objectList = [{}, {}, ..., {}], // placeholder: the array with thousands of records
    counter = 0,
    metadata = {
        dateTime : '111111111',
        groupId  : 'some hash',
        batchId  : 'other hash'
    };

objectList.forEach(function(obj) {
    // Attach the identical metadata to each document
    obj["dateTime"] = metadata.dateTime;
    obj["groupId"] = metadata.groupId;
    obj["batchId"] = metadata.batchId;

    bulk.insert(obj);
    counter++;

    // Execute every 500 operations and start a fresh builder,
    // so the queued operations don't grow unbounded in memory
    if (counter % 500 === 0) {
        bulk.execute();
        bulk = db.col.initializeOrderedBulkOp();
    }
});

// Execute the remaining operations (when the total is not a multiple of 500)
if (counter % 500 !== 0) {
    bulk.execute();
}
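
On MongoDB 3.2 and newer, the same batching can also be written with the bulkWrite() helper. A minimal sketch of that variant, assuming the same objectList and metadata as above and a shell with ES6 Object.assign:

var BATCH_SIZE = 500; // assumed batch size, matching the example above

for (var i = 0; i < objectList.length; i += BATCH_SIZE) {
    // Build insertOne operations for one slice, copying the shared metadata into each document
    var ops = objectList.slice(i, i + BATCH_SIZE).map(function(obj) {
        return { insertOne : { document : Object.assign({}, obj, metadata) } };
    });
    db.col.bulkWrite(ops, { ordered : false });
}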

