MongoDB - Aggregation Framework (Total Count)

Question

When running a normal "find" query on MongoDB, I can get the total result count (regardless of limit) by running "count" on the returned cursor. So even if I limit the result set to 10 (for example), I can still know that the total number of results was 53 (again, for example).

If I understand it correctly, however, the aggregation framework doesn't return a cursor but simply the results. So if I use the $limit pipeline operator, how can I know the total number of results regardless of that limit?

I guess I could run the aggregation twice (once to count the results via $group, and once with $limit for the actual limited results), but this seems inefficient.

An alternative approach could be to attach the total number of results to the documents (via $group) prior to the $limit operation, but this also seems inefficient, as the number would be attached to every document (instead of being returned just once for the set).

Am I missing something here? Any ideas? Thanks!

For example, if this is the query:

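// The stages are passed as a single array, which is the form
// accepted by both older and current shells.
db.article.aggregate([
    // One output document per author, with that author's post count.
    { $group : {
        _id : "$author",
        posts : { $sum : 1 }
    }},
    // Most prolific authors first.
    { $sort : { posts: -1 } },
    // Keep only the top five.
    { $limit : 5 }
]);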

How would I know how many results are available (before $limit)? The result isn't a cursor, so I can't just run count on it.

Recommended answer

Assaf, there are going to be some enhancements to the aggregation framework in the near future that may allow you to do your calculations in one pass easily. For now, it is best to run two queries in parallel: one aggregation for the number of posts of your top authors, and another aggregation for the total across all authors. Also note that if all you need is a count of documents, the count function is a very efficient way to get it. MongoDB can use its B-tree indexes to answer counts, which makes counts on queries very quick.
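The two-queries-in-parallel approach could look roughly like the sketch below. It is only an illustration under stated assumptions: the helper names (`topAuthorsPipeline`, `totalAuthorsPipeline`, `topAuthorsWithTotal`) are made up for this example, and `collection` is assumed to behave like a Node.js driver collection, where `aggregate(pipeline).toArray()` returns a Promise. The second pipeline counts the grouped results (one per author), which is the "total before $limit" the question asks about.

```javascript
// Shared stage: one output document per author with a post count.
const groupByAuthor = [
  { $group: { _id: "$author", posts: { $sum: 1 } } },
];

// Pipeline 1 (hypothetical helper): the limited list of top authors.
function topAuthorsPipeline(limit) {
  return [...groupByAuthor, { $sort: { posts: -1 } }, { $limit: limit }];
}

// Pipeline 2 (hypothetical helper): how many grouped results exist
// before any $limit is applied.
function totalAuthorsPipeline() {
  return [...groupByAuthor, { $group: { _id: null, total: { $sum: 1 } } }];
}

// Assumption: `collection.aggregate(pipeline).toArray()` returns a Promise,
// as a Node.js driver collection does.
async function topAuthorsWithTotal(collection, limit) {
  // Start both aggregations at once and wait for both to finish.
  const [authors, totals] = await Promise.all([
    collection.aggregate(topAuthorsPipeline(limit)).toArray(),
    collection.aggregate(totalAuthorsPipeline()).toArray(),
  ]);
  // The counting pipeline yields no document when nothing matched.
  const total = totals.length > 0 ? totals[0].total : 0;
  return { authors, total };
}
```

Both pipelines still scan and group the same documents, so this roughly doubles the work on the server; running them concurrently only hides some of the latency.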

If these aggregations turn out to be slow, there are a couple of strategies. First, keep in mind that you want to start the pipeline with a $match, if applicable, to reduce the result set. A $match can also be sped up by indexes. Second, you can perform these calculations as pre-aggregations. Instead of running the aggregations every time a user accesses some part of your app, have them run periodically in the background and store the results in a collection of pre-aggregated values. Your pages can then simply query the pre-calculated values from this collection.
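As a rough sketch of the pre-aggregation idea, the helper below builds a pipeline that filters first and then writes its results into a separate collection. Everything named here is an assumption for illustration: the `author_post_counts` collection, the `preAggregationPipeline` helper, and whatever filter you pass in. It also assumes a server version that supports the `$out` stage.

```javascript
// Hypothetical name for the collection that stores pre-aggregated values.
const SUMMARY_COLLECTION = "author_post_counts";

// Builds a pipeline intended to be run periodically in the background.
// `filter` is an ordinary query document; because $match is the first
// stage, it can take advantage of an index on the filtered fields.
function preAggregationPipeline(filter) {
  return [
    { $match: filter },
    { $group: { _id: "$author", posts: { $sum: 1 } } },
    // $out must be the last stage; it replaces the contents of the
    // target collection with the pipeline's results.
    { $out: SUMMARY_COLLECTION },
  ];
}
```

A background job could run something like `db.article.aggregate(preAggregationPipeline({ published: true }))`, where `published` is a made-up field. Pages would then read from the summary collection with a plain find, for example `db.author_post_counts.find().sort({ posts: -1 }).limit(5)`, and a count on that collection gives the total number of authors.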
