MongoDB: groupby subdocument and count + add total count

Question

Suppose I have the following document with subdocuments:

{
    "id":1,
    "url":"mysite.com",
    "views": 
     [
       {"ip":"1.1.1.1","date":"01-01-2015"},
       {"ip":"2.2.2.2","date":"01-01-2015"},
       {"ip":"1.1.1.1","date":"01-01-2015"},
       {"ip":"1.1.1.1","date":"01-01-2015"}
     ]
}

I want to count:

  1. how many views each IP has, grouped by the "ip" value
  2. and also the total number of subdocuments in "views"

If possible, I would like to do this in the same query, to achieve the following result:

[  
  {  
    "_id":"2.2.2.2",
    "count":1
  },
  {  
    "_id":"1.1.1.1",
    "count":3
  },
  {  
    "_id":"total",
    "count":4
  }
]

What I have achieved so far

Using the MongoDB Aggregation Framework, I have managed to achieve point 1 the following way:

db.collection.aggregate([
    {
        "$unwind": "$views"
    },
    {
        "$group": {
            "_id": "$views.ip",
            "count": {
                "$sum": 1
            }
        }
    }
])

which returns:

[  
  {  
    "_id":"2.2.2.2",
    "count":1
  },
  {  
    "_id":"1.1.1.1",
    "count":3
  }
]

I wish to return one extra document inside the array, which would be:

{  
  "_id":"total",
  "count":4
}

That would give the result described above, but I am stuck at this point and haven't been able to get there.

Recommended answer

This is not possible with $group alone within the same aggregation pipeline. In principle, the pipeline processes documents as they pass through it, and a stage does not need to produce one output document for every input document; some stages generate new documents, others filter documents out. In the above scenario, adding another $group step to get the grouped IP counts plus the total count would produce a different result from the one you are after:

db.collection.aggregate([
    {
        "$unwind": "$views"
    },
    {
        "$group": {
            "_id": "$views.ip",
            "count": {
                "$sum": 1
            }
        }
    },
    {
        "$group": {
            "_id": null,
            "total": {
                "$sum": "$count"
            }
        }
    }
])

You will only get the total count, since $group consumes all of its input documents (here, the documents with the grouped IP counts) and outputs one document per distinct group. With "_id": null, that extra group step puts every document from the previous stage into a single group, so the per-IP documents are gone from the output.
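To make the collapse concrete, here is a minimal sketch in plain JavaScript that mimics the two $group stages on the sample document from the question. The helpers unwindViews and groupSum are hypothetical stand-ins written for illustration, not MongoDB APIs, and they only approximate what the server does.

```javascript
// Sample document copied from the question.
const doc = {
  id: 1,
  url: "mysite.com",
  views: [
    { ip: "1.1.1.1", date: "01-01-2015" },
    { ip: "2.2.2.2", date: "01-01-2015" },
    { ip: "1.1.1.1", date: "01-01-2015" },
    { ip: "1.1.1.1", date: "01-01-2015" }
  ]
};

// Hypothetical stand-in for {"$unwind": "$views"}:
// one output document per element of the views array.
function unwindViews(docs) {
  return docs.flatMap(d => d.views.map(v => ({ ...d, views: v })));
}

// Hypothetical stand-in for a $group stage with a single $sum accumulator:
// one output document per distinct key.
function groupSum(docs, keyOf, valueOf, field) {
  const sums = new Map();
  for (const d of docs) {
    const key = keyOf(d);
    sums.set(key, (sums.get(key) || 0) + valueOf(d));
  }
  return [...sums].map(([key, sum]) => ({ _id: key, [field]: sum }));
}

// First $group: _id is "$views.ip", count is {"$sum": 1}.
const perIp = groupSum(unwindViews([doc]), d => d.views.ip, () => 1, "count");

// Second $group: _id is null, total is {"$sum": "$count"}.
const totalOnly = groupSum(perIp, () => null, d => d.count, "total");

console.log(perIp);     // two documents, one per IP
console.log(totalOnly); // [ { _id: null, total: 4 } ]
```

Because every per-IP document maps to the same null key, the second call can only ever return one document.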

However, you can get the total count as an extra field within each grouped document in your final result. The following example does this with an initial $project pipeline stage that computes the total via the $size operator. Note that $size is evaluated per document, so "total" is the length of one document's "views" array, which matches the single-document example here:

db.collection.aggregate([
    {
        "$project": {
            "views": 1,
            "views_size": { "$size": "$views" }
        }
    },
    {
        "$unwind": "$views"
    },
    {
        "$group": {
            "_id": "$views.ip",
            "count": {
                "$sum": 1
            },
            "total": { "$first": "$views_size" }
        }
    }
])

Sample output

[  
  {  
    "_id": "2.2.2.2",
    "count": 1,
    "total": 4
  },
  {  
    "_id": "1.1.1.1",
    "count": 3,
    "total": 4
  }
]
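If the exact array shape from the question is still required, one option is to append the total document in application code once the per-IP counts have been fetched. The following is only a sketch under assumptions: the grouped array is hard-coded from the question's output rather than read from a live cursor, and withTotal is a hypothetical helper, not part of MongoDB.

```javascript
// Per-IP counts as returned by the $unwind + $group pipeline in the question.
// Assumption: they have already been loaded into a plain array.
const grouped = [
  { _id: "2.2.2.2", count: 1 },
  { _id: "1.1.1.1", count: 3 }
];

// Hypothetical helper: sums the per-IP counts and appends a "total" entry,
// leaving the input array untouched.
function withTotal(groups) {
  const total = groups.reduce((sum, g) => sum + g.count, 0);
  return [...groups, { _id: "total", count: total }];
}

const result = withTotal(grouped);
console.log(result);
```

This keeps the pipeline itself unchanged and derives the total from counts that are already in memory, at the cost of doing the last step outside the database.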
