MongoDB: count all array elements in all objects matching criteria


Question

I have a collection that logs activity on objects, like this:

{
    "_id" : ObjectId("55e3fd1d7cb5ac9a458b4567"),
    "object_id" : "1",
    "activity" : [ 
        {
            "action" : "test_action",
            "time" : ISODate("2015-08-31T00:00:00.000Z")
        },
        {
            "action" : "test_action",
            "time" : ISODate("2015-08-31T00:00:22.000Z")
        }
    ]
}

{
    "_id" : ObjectId("55e3fd127cb5ac77478b4567"),
    "object_id" : "2",
    "activity" : [ 
        {
            "action" : "test_action",
            "time" : ISODate("2015-08-31T00:00:00.000Z")
        }
    ]
}

{
    "_id" : ObjectId("55e3fd0f7cb5ac9f458b4567"),
    "object_id" : "1",
    "activity" : [ 
        {
            "action" : "test_action",
            "time" : ISODate("2015-08-30T00:00:00.000Z")
        }
    ]
}

If I run the following query:

db.objects.find({
    "createddate": {$gte : ISODate("2015-08-30T00:00:00.000Z")},
    "activity.action" : "test_action"
}).count()

it returns the count of documents containing "test_action" (3 in this set), but I need the count of all "test_action" entries (4 in this set). How do I do that?

Recommended answer

The most "performant" way to do this is to skip $unwind altogether and simply $group to count. Note that $setDifference treats its input as a set, so entries that are exactly identical within one document are counted only once. Essentially, "filter" each array, take the $size of the result, and $sum those sizes:

db.objects.aggregate([
    { "$match": {
        "createddate": {
            "$gte": ISODate("2015-08-30T00:00:00.000Z")
        },
        "activity.action": "test_action"
    }},
    { "$group": {
        "_id": null,
        "count": {
            "$sum": {
                "$size": {
                    "$setDifference": [
                        { "$map": {
                            "input": "$activity",
                            "as": "el",
                            "in": {
                                "$cond": [ 
                                    { "$eq": [ "$$el.action", "test_action" ] },
                                    "$$el",
                                    false
                                ]
                            }               
                        }},
                        [false]
                    ]
                }
            }
        }
    }}
])

MongoDB 3.2 and later provide $filter, which makes this much simpler:

db.objects.aggregate([
    { "$match": {
        "createddate": {
            "$gte": ISODate("2015-08-30T00:00:00.000Z")
        },
        "activity.action": "test_action"
    }},
    { "$group": {
        "_id": null,
        "count": {
            "$sum": {
                "$size": {
                    "$filter": {
                        "input": "$activity",
                        "as": "el",
                        "cond": {
                            "$eq": [ "$$el.action", "test_action" ]
                        }
                    }
                }
            }
        }
    }}
])
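
For intuition, the per-document filter-and-count that this pipeline performs can be emulated in plain JavaScript, outside MongoDB. This is only a sketch under stated assumptions: the helper name countMatchingElements is hypothetical, the documents are the three from the question with _id and time omitted, and the createddate condition is left out because the sample documents do not show that field.

```javascript
// The three sample documents from the question (_id and time omitted for brevity).
const docs = [
  { object_id: "1", activity: [{ action: "test_action" }, { action: "test_action" }] },
  { object_id: "2", activity: [{ action: "test_action" }] },
  { object_id: "1", activity: [{ action: "test_action" }] },
];

// Hypothetical helper mirroring the pipeline:
//   $match  -> keep documents with at least one matching array entry
//   $filter -> keep only the matching entries of each document
//   $size   -> count them per document
//   $sum    -> add the per-document counts together
function countMatchingElements(documents, action) {
  return documents
    .filter((doc) => doc.activity.some((el) => el.action === action))
    .reduce(
      (total, doc) =>
        total + doc.activity.filter((el) => el.action === action).length,
      0
    );
}

// What find(...).count() reports: the number of matching documents.
const matchingDocs = docs.filter((doc) =>
  doc.activity.some((el) => el.action === "test_action")
).length;

// What the aggregation reports: the number of matching array entries.
const matchingElements = countMatchingElements(docs, "test_action");

console.log(matchingDocs, matchingElements);
```

The two numbers differ because the first counts documents and the second counts array entries, which is the distinction the question asks about.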

Using $unwind de-normalizes the documents, effectively creating a copy of each document per array entry. Avoid this where possible because of the often extreme cost. Filtering and counting array entries per document is much faster by comparison, as is a simple $match and $group pipeline compared with one that has many stages.
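
To illustrate the copying described above, here is a rough plain-JavaScript emulation, not MongoDB code. The function unwindActivity is a hypothetical stand-in for $unwind, and the "other_action" entry is an assumption added to the question's sample data so that filtering has something to exclude.

```javascript
// Sample documents from the question (_id and time omitted), plus one
// hypothetical "other_action" entry that is NOT in the original data.
const docs = [
  { object_id: "1", activity: [{ action: "test_action" }, { action: "test_action" }] },
  { object_id: "2", activity: [{ action: "test_action" }, { action: "other_action" }] },
  { object_id: "1", activity: [{ action: "test_action" }] },
];

// Hypothetical stand-in for $unwind: emits one document per array entry,
// each carrying a copy of the parent document's other fields.
function unwindActivity(documents) {
  return documents.flatMap((doc) =>
    doc.activity.map((el) => ({ ...doc, activity: el }))
  );
}

// $unwind-style approach: materialize every copy, then filter and count.
const unwound = unwindActivity(docs);
const countViaUnwind = unwound.filter(
  (doc) => doc.activity.action === "test_action"
).length;

// Per-document approach: filter each array in place and sum the sizes.
const countPerDocument = docs.reduce(
  (total, doc) =>
    total + doc.activity.filter((el) => el.action === "test_action").length,
  0
);

console.log(docs.length, unwound.length, countViaUnwind, countPerDocument);
```

Both approaches arrive at the same count, but the $unwind-style version first builds one intermediate document per array entry, which is the overhead the answer recommends avoiding.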
