数组内的多个嵌套组 [英] Multiple Nested Group Within Array
问题描述
我在MongoDB中有一组元素,如下所示:
I'm having group of elements in MongoDB as given below:
/* 1 */
{
"_id" : ObjectId("58736c7f7d43c305461cdb9b"),
"Name" : "Kevin",
"pb_event" : [
{
"event_type" : "Birthday",
"event_date" : "2014-08-31"
},
{
"event_type" : "Anniversary",
"event_date" : "2014-08-31"
}
]
}
/* 2 */
{
"_id" : ObjectId("58736cfc7d43c305461cdba8"),
"Name" : "Peter",
"pb_event" : [
{
"event_type" : "Birthday",
"event_date" : "2014-08-31"
},
{
"event_type" : "Anniversary",
"event_date" : "2015-03-24"
}
]
}
/* 3 */
{
"_id" : ObjectId("58736cfc7d43c305461cdba9"),
"Name" : "Pole",
"pb_event" : [
{
"event_type" : "Birthday",
"event_date" : "2015-03-24"
},
{
"event_type" : "Work Anniversary",
"event_date" : "2015-03-24"
}
]
}
现在,我希望在event_date
上具有分组的结果,然后在event_type
上具有分组的结果. event_type
包含相关用户的所有名称,然后包含各个数组中的记录数.
Now I want the result that has group on event_date
then after group on event_type
. event_type
contain all names of the related user, then count of records in the respective array.
预期产量
/* 1 */
{
"event_date" : "2014-08-31",
"data" : [
{
"event_type" : "Birthday",
"details" : [
{
"_id" : ObjectId("58736c7f7d43c305461cdb9b"),
"name" : "Kevin"
},
{
"_id" : ObjectId("58736cfc7d43c305461cdba8"),
"name" : "Peter"
}
],
"count" : 2
},
{
"event_type" : "Anniversary",
"details" : [
{
"_id" : ObjectId("58736c7f7d43c305461cdb9b"),
"name" : "Kevin"
}
],
"count" : 1
}
]
}
/* 2 */
{
"event_date" : "2015-03-24",
"data" : [
{
"event_type" : "Anniversary",
"details" : [
{
"_id" : ObjectId("58736cfc7d43c305461cdba8"),
"name" : "Peter"
}
],
"count" : 1
},
{
"event_type" : "Birthday",
"details" : [
{
"_id" : ObjectId("58736cfc7d43c305461cdba9"),
"name" : "Pole"
}
],
"count" : 1
},
{
"event_type" : "Work Anniversary",
"details" : [
{
"_id" : ObjectId("58736cfc7d43c305461cdba9"),
"name" : "Pole"
}
],
"count" : 1
}
]
}
推荐答案
使用聚合框架,您需要运行具有以下阶段的管道,以便获得所需的结果:
Using the aggregation framework, you would need to run a pipeline that has the following stages so that you get the desired result:
db.collection.aggregate([
{ "$unwind": "$pb_event" },
{
"$group": {
"_id": {
"event_date": "$pb_event.event_date",
"event_type": "$pb_event.event_type"
},
"details": {
"$push": {
"_id": "$_id",
"name": "$Name"
}
},
"count": { "$sum": 1 }
}
},
{
"$group": {
"_id": "$_id.event_date",
"data": {
"$push": {
"event_type": "$_id.event_type",
"details": "$details",
"count": "$count"
}
}
}
},
{
"$project": {
"_id": 0,
"event_date": "$_id",
"data": 1
}
}
])
In the above pipeline, the first step is the $unwind
operator
{ "$unwind": "$pb_event" }
在将数据存储为数组时非常方便.将展开运算符应用于列表数据字段时,它将为应用展开的列表数据字段的每个元素生成一条新记录.基本上可以使数据变平整.
which comes in quite handy when the data is stored as an array. When the unwind operator is applied on a list data field, it will generate a new record for each and every element of the list data field on which unwind is applied. It basically flattens the data.
This is a necessary operation for the next pipeline stage, the $group
step where you group the flattened documents by the deconstructed pb_event
array fields event_date
and event_type
:
{
"$group": {
"_id": {
"event_date": "$pb_event.event_date",
"event_type": "$pb_event.event_type"
},
"details": {
"$push": {
"_id": "$_id",
"name": "$Name"
}
},
"count": { "$sum": 1 }
}
},
$group
管道运算符类似于SQL的GROUP BY
子句.在SQL中,除非使用任何聚合函数,否则不能使用GROUP BY
.同样,您还必须在MongoDB中使用聚合函数(称为累加器运算符).您可以在此处了解更多信息
The $group
pipeline operator is similar to the SQL's GROUP BY
clause. In SQL, you can't use GROUP BY
unless you use any of the aggregation functions. The same way, you have to use an aggregation function in MongoDB (called an accumulator operator) as well. You can read more about the aggregation functions here.
在此 $group
操作,即使用 $push
运算符,该运算符返回每个组的表达式值数组.
In this $group
operation, the logic to calculate the count aggregate i.e. the total number of documents in the group using the $sum
accumulator operator. Within the same pipeline, you can aggregate a list of the name
and _id
subdocuments by using the $push
operator which returns an array of expression values for each group.
之前的 $group
管道
{
"$group": {
"_id": "$_id.event_date",
"data": {
"$push": {
"event_type": "$_id.event_type",
"details": "$details",
"count": "$count"
}
}
}
}
将通过对event_date
进行分组来进一步汇总来自最后一个管道的结果,这可以通过使用 $project
管道阶段
will further aggregate the results from the last pipeline by grouping on the event_date
, which forms basis of the desired output by creating a new data list using $push
and then the final $project
pipeline stage
{
"$project": {
"_id": 0,
"event_date": "$_id",
"data": 1
}
}
通过将_id
字段重命名为event_date
并保留另一个字段来重塑文档字段.
reshapes the documents fields by renaming the _id
field to event_date
and retaining the other field.
这篇关于数组内的多个嵌套组的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!