MongoDB - Aggregation Framework (Total Count)


Question

When running a normal "find" query on MongoDB, I can get the total result count (regardless of limit) by running "count" on the returned cursor. So even if I limit the result set to 10 (for example), I can still know that the total number of results was 53 (again, for example).

If I understand it correctly, however, the aggregation framework doesn't return a cursor but simply the results. So if I use the $limit pipeline operator, how can I know the total number of results regardless of that limit?

I guess I could run the aggregation twice (once to count the results via $group, and once with $limit for the actual limited results), but this seems inefficient.

An alternative approach could be to attach the total number of results to the documents (via $group) prior to the $limit operation, but this also seems inefficient, as the number would be attached to every document instead of being returned just once for the set.

Am I missing something here? Any ideas? Thanks!

For example, if this is the query:

db.article.aggregate([
    // Count posts per author.
    { $group : {
        _id : "$author",
        posts : { $sum : 1 }
    }},
    // Most prolific authors first.
    { $sort : { posts: -1 } },
    // Keep only the top five.
    { $limit : 5 }
]);

How would I know how many results are available (before $limit)? The result isn't a cursor, so I can't just run count on it.

Recommended answer

Assaf, there are going to be some enhancements to the aggregation framework in the near future that may allow you to do your calculations in one pass easily. For now, it is best to run two queries in parallel: one aggregation for the number of posts by your top authors, and another aggregation for the total across all authors. Also note that if all you need is a count of documents, the count function is a very efficient way to get it. MongoDB caches counts within btree indexes, allowing for very quick counts on queries.
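The two-query approach could look roughly like the sketch below, written for the MongoDB Node.js driver rather than the shell. The helper names (`topAuthorsPipeline`, `authorCountPipeline`, `topAuthorsWithTotal`) are illustrative, not part of any API, and the second pipeline is assumed to count the grouped results (authors) available before $limit, which is what the question asks for.

```javascript
// Sketch only: `collection` is assumed to be a MongoDB Node.js driver
// Collection (or any object whose aggregate(pipeline) returns something
// with a toArray() method that resolves to an array of documents).

// Stage shared by both queries: one output document per author.
const groupByAuthor = { $group: { _id: "$author", posts: { $sum: 1 } } };

// Query 1: the limited page of results (top authors by post count).
function topAuthorsPipeline(limit) {
  return [groupByAuthor, { $sort: { posts: -1 } }, { $limit: limit }];
}

// Query 2: how many grouped results exist before $limit is applied.
function authorCountPipeline() {
  return [groupByAuthor, { $group: { _id: null, total: { $sum: 1 } } }];
}

// Hypothetical helper: issues both aggregations without waiting for the
// first one to finish, then combines their results.
async function topAuthorsWithTotal(collection, limit) {
  const [authors, counts] = await Promise.all([
    collection.aggregate(topAuthorsPipeline(limit)).toArray(),
    collection.aggregate(authorCountPipeline()).toArray(),
  ]);
  // An empty collection yields no count document, so fall back to 0.
  const total = counts.length > 0 ? counts[0].total : 0;
  return { authors, total };
}
```

Calling `topAuthorsWithTotal(db.collection("article"), 5)` would then resolve to the five top authors plus the total number of authors, assuming `db` is a connected driver `Db` handle.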

If these aggregations turn out to be slow, there are a couple of strategies. First, keep in mind that you want to start the pipeline with a $match, where applicable, to reduce the result set. A $match can also be sped up by indexes. Second, you can perform these calculations as pre-aggregations. Instead of possibly running the aggregations every time a user accesses some part of your app, have them run periodically in the background and store the results in a collection of pre-aggregated values. Your pages can then simply query the pre-calculated values from that collection.
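A minimal sketch of the pre-aggregation idea, again for the Node.js driver. The `published` date field and the `author_post_counts` target collection are assumptions made for illustration; neither appears in the original post. The sketch uses the `$out` stage to store the result, which is one possible way to do it and not necessarily what the answer had in mind.

```javascript
// Sketch only. Assumed names (not from the original post): the `published`
// date field and the `author_post_counts` target collection.

// Builds a pipeline that filters first, then groups, then stores the result.
function preAggregationPipeline(since) {
  return [
    // $match comes first so an index on `published` can narrow the input.
    { $match: { published: { $gte: since } } },
    { $group: { _id: "$author", posts: { $sum: 1 } } },
    // $out must be the last stage; it replaces the target collection
    // with the pipeline output.
    { $out: "author_post_counts" },
  ];
}

// Hypothetical refresh job, meant to be called from a scheduler.
// `collection` is assumed to be a Node.js driver Collection for `article`.
async function refreshAuthorPostCounts(collection, since) {
  // Draining the cursor is what actually runs the pipeline.
  await collection.aggregate(preAggregationPipeline(since)).toArray();
}
```

A page can then read `author_post_counts` with an ordinary find, sort, and limit, and count that collection to get the total, without re-running the aggregation on each request.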
