数组内的多个嵌套组 [英] Multiple Nested Group Within Array

查看:112
本文介绍了数组内的多个嵌套组的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在MongoDB中有一组元素,如下所示:

I'm having group of elements in MongoDB as given below:

/* 1 */
{
    "_id" : ObjectId("58736c7f7d43c305461cdb9b"),
    "Name" : "Kevin",
    "pb_event" : [ 
        {
            "event_type" : "Birthday",
            "event_date" : "2014-08-31"
        }, 
        {
            "event_type" : "Anniversary",
            "event_date" : "2014-08-31"
        }
    ]
}

/* 2 */
{
    "_id" : ObjectId("58736cfc7d43c305461cdba8"),
    "Name" : "Peter",
    "pb_event" : [ 
        {
            "event_type" : "Birthday",
            "event_date" : "2014-08-31"
        }, 
        {
            "event_type" : "Anniversary",
            "event_date" : "2015-03-24"
        }
    ]
}

/* 3 */
{
    "_id" : ObjectId("58736cfc7d43c305461cdba9"),
    "Name" : "Pole",
    "pb_event" : [ 
        {
            "event_type" : "Birthday",
            "event_date" : "2015-03-24"
        }, 
        {
            "event_type" : "Work Anniversary",
            "event_date" : "2015-03-24"
        }
    ]
}

现在,我希望在event_date上具有分组的结果,然后在event_type上具有分组的结果. event_type包含相关用户的所有名称,然后包含各个数组中的记录数.

Now I want the result that has group on event_date then after group on event_type. event_type contain all names of the related user, then count of records in the respective array.

预期产量

/* 1 */
{    
    "event_date" : "2014-08-31",
    "data" : [ 
        {
            "event_type" : "Birthday",
            "details" : [ 
                {
                    "_id" : ObjectId("58736c7f7d43c305461cdb9b"),
                    "name" : "Kevin"
                }, 
                {
                    "_id" : ObjectId("58736cfc7d43c305461cdba8"),
                    "name" : "Peter"
                }
            ],
            "count" : 2
        }, 
        {
            "event_type" : "Anniversary",
            "details" : [ 
                {
                    "_id" : ObjectId("58736c7f7d43c305461cdb9b"),
                    "name" : "Kevin"
                }
            ],
            "count" : 1
        }
    ]
}

/* 2 */
{
    "event_date" : "2015-03-24",
    "data" : [ 
        {
            "event_type" : "Anniversary",
            "details" : [ 
                {
                    "_id" : ObjectId("58736cfc7d43c305461cdba8"),
                    "name" : "Peter"
                }
            ],
            "count" : 1
        }, 
        {
            "event_type" : "Birthday",
            "details" : [ 
                {
                    "_id" : ObjectId("58736cfc7d43c305461cdba9"),
                    "name" : "Pole"
                }
            ],
            "count" : 1
        }, 
        {
            "event_type" : "Work Anniversary",
            "details" : [ 
                {
                    "_id" : ObjectId("58736cfc7d43c305461cdba9"),
                    "name" : "Pole"
                }
            ],
            "count" : 1
        }
    ]
}

推荐答案

使用聚合框架,您需要运行具有以下阶段的管道,以便获得所需的结果:

Using the aggregation framework, you would need to run a pipeline that has the following stages so that you get the desired result:

db.collection.aggregate([
    { "$unwind": "$pb_event" },
    {
        "$group": {
            "_id": {
                "event_date": "$pb_event.event_date",
                "event_type": "$pb_event.event_type" 
            },            
            "details": {
                "$push": {
                    "_id": "$_id",
                    "name": "$Name"
                }
            },
            "count": { "$sum": 1 }            
        }
    },    
    {
        "$group": {
            "_id": "$_id.event_date",            
            "data": {
                "$push": {
                    "event_type": "$_id.event_type",
                    "details": "$details",
                    "count": "$count"
                }
            }           
        }
    },
    {
        "$project": {
            "_id": 0,
            "event_date": "$_id",
            "data": 1
        }
    }
])


在上述管道中,第一步是


In the above pipeline, the first step is the $unwind operator

{ "$unwind": "$pb_event" }

在将数据存储为数组时非常方便.将展开运算符应用于列表数据字段时,它将为应用展开的列表数据字段的每个元素生成一条新记录.基本上可以使数据变平整.

which comes in quite handy when the data is stored as an array. When the unwind operator is applied on a list data field, it will generate a new record for each and every element of the list data field on which unwind is applied. It basically flattens the data.

这是下一个管道阶段的必要操作,

This is a necessary operation for the next pipeline stage, the $group step where you group the flattened documents by the deconstructed pb_event array fields event_date and event_type:

{
    "$group": {
        "_id": {
            "event_date": "$pb_event.event_date",
            "event_type": "$pb_event.event_type" 
        },            
        "details": {
            "$push": {
                "_id": "$_id",
                "name": "$Name"
            }
        },
        "count": { "$sum": 1 }            
    }
},

$group 管道运算符类似于SQL的GROUP BY子句.在SQL中,除非使用任何聚合函数,否则不能使用GROUP BY.同样,您还必须在MongoDB中使用聚合函数(称为累加器运算符).您可以在此处了解更多信息

The $group pipeline operator is similar to the SQL's GROUP BY clause. In SQL, you can't use GROUP BY unless you use any of the aggregation functions. The same way, you have to use an aggregation function in MongoDB (called an accumulator operator) as well. You can read more about the aggregation functions here.

在此 $group 操作,即使用

In this $group operation, the logic to calculate the count aggregate i.e. the total number of documents in the group using the $sum accumulator operator. Within the same pipeline, you can aggregate a list of the name and _id subdocuments by using the $push operator which returns an array of expression values for each group.

之前的 $group 管道

{
    "$group": {
        "_id": "$_id.event_date",            
        "data": {
            "$push": {
                "event_type": "$_id.event_type",
                "details": "$details",
                "count": "$count"
            }
        }           
    }
}

将通过对event_date进行分组来进一步汇总来自最后一个管道的结果,这可以通过使用

will further aggregate the results from the last pipeline by grouping on the event_date, which forms basis of the desired output by creating a new data list using $push and then the final $project pipeline stage

{
    "$project": {
        "_id": 0,
        "event_date": "$_id",
        "data": 1
    }
}

通过将_id字段重命名为event_date并保留另一个字段来重塑文档字段.

reshapes the documents fields by renaming the _id field to event_date and retaining the other field.

这篇关于数组内的多个嵌套组的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆