MongoDB Count() vs. Aggregation


Question

I've used aggregation in mongo a lot, and I know its performance benefits for grouped counts and the like. But does mongo show any performance difference between these two ways of counting all documents in a collection?

collection.aggregate([
  {
    $match: {}
  },{
    $group: {
      _id: null, 
      count: {$sum: 1}
    }
}]);

collection.find({}).count()

Update: Second case: Let's say we have this sample data:

{_id: 1, type: 'one', value: true}
{_id: 2, type: 'two', value: false}
{_id: 4, type: 'five', value: false}

Using aggregate():

var _ids = ['id1', 'id2', 'id3'];
var counted = Collections.mail.aggregate([
  {
    '$match': {
      _id: {
        '$in': _ids
      },
      value: false
    }
  }, {
    '$group': {
      _id: "$type",
      count: {
        '$sum': 1
      }
    }
  }
]);
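To make the pipeline's output concrete, here is a plain JavaScript sketch that mimics what the `$match` and `$group` stages compute, applied to the sample data above. It does not touch MongoDB. The helper name `groupedCount` and the numeric id list `[1, 2, 4]` are assumptions made for illustration (the `_ids` in the question are placeholders).

```javascript
// Sample documents from the question.
var docs = [
  {_id: 1, type: 'one', value: true},
  {_id: 2, type: 'two', value: false},
  {_id: 4, type: 'five', value: false}
];

// Hypothetical helper: emulates, in memory,
//   {$match: {_id: {$in: ids}, value: value}}
// followed by
//   {$group: {_id: "$type", count: {$sum: 1}}}
function groupedCount(docs, ids, value) {
  var counts = {};
  docs.forEach(function (doc) {
    if (ids.indexOf(doc._id) !== -1 && doc.value === value) {
      counts[doc.type] = (counts[doc.type] || 0) + 1;
    }
  });
  // Same shape as the aggregation output: one {_id, count} document per type.
  return Object.keys(counts).map(function (type) {
    return {_id: type, count: counts[type]};
  });
}

var result = groupedCount(docs, [1, 2, 4], false);
```

A single pass yields one count per type, which is the part of the work that the aggregation does on the server.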

Using count():

var counted = {};
var type = 'two';
for (var i = 0, len = _ids.length; i < len; i++) {
  counted[_ids[i]] = Collections.mail.find({
    _id: _ids[i], value: false, type: type
  }).count();
}

Recommended answer

.count() is faster by far. You can view its implementation by evaluating it in the shell without calling it:

// Note the missing parentheses at the end
db.collection.count

This returns the length of the cursor for the default query (when count() is called with no query document). That, in turn, is answered from the collection's stored metadata rather than by reading documents, iirc.

An aggregation, however, reads and processes every single document. It can only get anywhere near the same order of magnitude as .count() on collections of no more than roughly 100k documents (give or take, depending on your RAM).
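The difference between looking up a precomputed size and visiting every document can be illustrated with a toy model in plain JavaScript. This is only a sketch and not how MongoDB is implemented: `ToyCollection`, `countFromBookkeeping`, and `countByScanning` are hypothetical names, and the stored counter merely stands in for whatever bookkeeping the server keeps.

```javascript
// Toy model only: NOT MongoDB's implementation.
function ToyCollection() {
  this.docs = [];
  this.docCount = 0; // bookkeeping updated on every insert
}

ToyCollection.prototype.insert = function (doc) {
  this.docs.push(doc);
  this.docCount += 1;
};

// Constant time: returns the stored counter without touching any document.
ToyCollection.prototype.countFromBookkeeping = function () {
  return this.docCount;
};

// Linear time: visits every document and adds 1 for each,
// much like a $group stage with {$sum: 1}.
ToyCollection.prototype.countByScanning = function () {
  return this.docs.reduce(function (sum) { return sum + 1; }, 0);
};

var toy = new ToyCollection();
for (var i = 0; i < 1000; i++) {
  toy.insert({_id: i});
}
var fast = toy.countFromBookkeeping();
var slow = toy.countByScanning();
```

Both calls return the same number, but the work done by the scanning variant grows with the collection size while the lookup does not.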

The function below was run against a collection with some 12M entries:

function checkSpeed(col,iterations){

  // Get the collection
  var collectionUnderTest = db[col];

  // The collection we are writing our stats to
  var stats = db[col+'STATS']

  // remove old stats
  stats.remove({})

  // Prevent allocation in loop
  var start = new Date().getTime()
  var duration = new Date().getTime()

  print("Counting with count()")
  for (var i = 1; i <= iterations; i++){
    start = new Date().getTime();
    var result = collectionUnderTest.count()
    duration = new Date().getTime() - start
    stats.insert({"type":"count","pass":i,"duration":duration,"count":result})
  }

  print("Counting with aggregation")
  for(var j = 1; j <= iterations; j++){
    start = new Date().getTime()
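    // Assumes a shell where aggregate() returns a cursor (MongoDB 2.6+);
    // the single result document is read inside the timed section.
    var doc = collectionUnderTest.aggregate([{ $group:{_id: null, count:{ $sum: 1 } } }]).toArray()[0]
    duration = new Date().getTime() - start
    stats.insert({"type":"aggregation", "pass":j, "duration": duration,"count":doc.count})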
  }

  var averages = stats.aggregate([
   {$group:{_id:"$type","average":{"$avg":"$duration"}}} 
  ])

  return averages
}

It returned:

{ "_id" : "aggregation", "average" : 43828.8 }
{ "_id" : "count", "average" : 0.6 }

Durations are in milliseconds.
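The measuring pattern used in checkSpeed can be tried without a database. The sketch below is a minimal, assumed stand-in: `averageDuration` is a hypothetical helper that times any function over a number of passes and returns the mean duration in milliseconds, which is what the final `$avg` stage computes per method.

```javascript
// Hypothetical helper: runs fn `iterations` times and returns the mean
// wall-clock duration in milliseconds.
function averageDuration(fn, iterations) {
  var total = 0;
  for (var i = 1; i <= iterations; i++) {
    var start = new Date().getTime();
    fn();
    total += new Date().getTime() - start;
  }
  return total / iterations;
}

// Usage: time a trivial function over 5 passes.
var calls = 0;
var avg = averageDuration(function () { calls += 1; }, 5);
```

In the mongo shell, the function passed in could wrap either counting call, for example one that invokes count() on the collection under test.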

hth
