Obtaining $group result with group count

Problem description

Assuming I have a collection called "posts" (in reality it is a more complex collection, posts is too simple) with the following structure:

> db.posts.find()

{ "_id" : ObjectId("50ad8d451d41c8fc58000003"), "title" : "Lorem ipsum", "author" : "John Doe", "content" : "This is the content", "tags" : [ "SOME", "RANDOM", "TAGS" ] }

I expect this collection to grow to hundreds of thousands, perhaps millions, of documents. I need to query for posts by tag, group the results by tag, and display the results paginated. This is where the aggregation framework comes in. I plan to use the aggregate() method to query the collection:

db.posts.aggregate([
  { "$unwind" : "$tags" },
  { "$group" : {
      _id: { tag: "$tags" },
      count: { $sum: 1 }
  } }
]);
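
To make the shape of that pipeline's output concrete, here is a minimal plain-JavaScript sketch that mimics $unwind followed by $group on an in-memory array. The sample documents and the helper name countTags are made up for illustration. This is not how MongoDB executes the pipeline, only an approximation of what it computes.

```javascript
// Hypothetical sample data shaped like the "posts" documents above.
const posts = [
  { title: "Lorem ipsum", tags: ["SOME", "RANDOM", "TAGS"] },
  { title: "Dolor sit", tags: ["RANDOM", "OTHER"] },
];

// countTags is an illustrative helper, not a MongoDB API.
function countTags(docs) {
  const counts = new Map();
  for (const doc of docs) {
    // $unwind: treat each tag as its own record.
    for (const tag of doc.tags || []) {
      // $group with { $sum: 1 }: increment the counter for this tag.
      counts.set(tag, (counts.get(tag) || 0) + 1);
    }
  }
  // One output document per distinct tag, like the $group stage emits.
  return Array.from(counts, ([tag, count]) => ({ _id: { tag }, count }));
}

const groups = countTags(posts);
console.log(groups);        // one entry per distinct tag
console.log(groups.length); // the number the paginator needs
```

The length of that array is the value the paginator needs, which is why the number of groups matters and not the number of posts.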

The catch is that to create the paginator I need to know the length of the output array. I know that one way to get it is:

db.posts.aggregate([
  { "$unwind" : "$tags" },
  { "$group" : {
      _id: { tag: "$tags" },
      count: { $sum: 1 }
  } },
  { "$group" : {
      _id: null,
      total: { $sum: 1 }
  } }
]);

But that would discard the output of the previous stage (the first $group). Is there a way to combine the two operations while preserving each stage's output? I know that the output of the whole aggregate operation can be cast to an array in some language and its contents counted, but the pipeline output may exceed the 16 MB limit. Also, running the same query a second time just to obtain the count seems wasteful.

So is it possible to obtain the document results and the count at the same time? Any help is appreciated.

Recommended answer

  1. Use $project to save tag and count into tmp.
  2. Use $push or $addToSet to store tmp into your data list.

Code:

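db.test.aggregate([
    // One document per tag
    {$unwind: '$tags'},
    // Count occurrences of each tag
    {$group: {_id: '$tags', count: {$sum: 1}}},
    // Wrap tag and count into a single tmp sub-document
    {$project: {tmp: {tag: '$_id', count: '$count'}}},
    // Count the groups and collect them into data
    {$group: {_id: null, total: {$sum: 1}, data: {$addToSet: '$tmp'}}}
])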

Output:

{
    "result" : [
            {
                    "_id" : null,
                    "total" : 5,
                    "data" : [
                            {
                                    "tag" : "SOME",
                                    "count" : 1
                            },
                            {
                                    "tag" : "RANDOM",
                                    "count" : 2
                            },
                            {
                                    "tag" : "TAGS1",
                                    "count" : 1
                            },
                            {
                                    "tag" : "TAGS",
                                    "count" : 1
                            },
                            {
                                    "tag" : "SOME1",
                                    "count" : 1
                            }
                    ]
            }
    ],
    "ok" : 1
}
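
On MongoDB 3.4 or later, the $facet stage offers another way to return one page of groups together with the total in a single round trip. The following is a minimal sketch and not part of the original answer; the helper name buildTagPagePipeline and its page and pageSize parameters are assumptions made for illustration. The $facet result is still a single document, so it remains subject to the 16 MB limit, but only one page of groups is embedded in it, not all of them.

```javascript
// buildTagPagePipeline is a hypothetical helper, not a MongoDB API.
// page is 1-based; pageSize is the number of tag groups per page.
function buildTagPagePipeline(page, pageSize) {
  return [
    { $unwind: "$tags" },
    { $group: { _id: "$tags", count: { $sum: 1 } } },
    // Sort so that pages are stable between calls.
    { $sort: { _id: 1 } },
    {
      $facet: {
        // Number of distinct tags, i.e. the length the paginator needs.
        total: [{ $count: "value" }],
        // Only the requested page of groups.
        data: [{ $skip: (page - 1) * pageSize }, { $limit: pageSize }],
      },
    },
  ];
}

const pipeline = buildTagPagePipeline(2, 10);
console.log(JSON.stringify(pipeline, null, 2));
// In the mongo shell: db.posts.aggregate(pipeline)
```

With this shape the result document carries a total array and a data array. If no documents reach the $facet stage, total comes back as an empty array, so the caller should treat that case as zero.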
