How to get the original document back after aggregation

Question

I have a case where I want to query a collection of documents that each have a number of items under an array field "forms". The problem is that I want to return only the documents in which every item in "forms" has the status "closed".

So here is a sample of two different documents in the collection:

{
    "_id" : "Tvq444454j",
    "name" : "Jim",
    "forms" : [
        {
            "name" : "Jorney",
            "status" : "closed"
        },
        {
            "name" : "Women",
            "status" : "void"
        },
        {
            "name" : "Child",
            "status" : "closed"
        },
        {
            "name" : "Farm",
            "status" : "closed"
        }
    ]
},

{
    "_id" : "Tvq579754r",
    "name" : "Tom",
    "forms" : [
        {
            "name" : "PreOp",
            "status" : "closed"
        },
        {
            "name" : "Alert",
            "status" : "closed"
        },
        {
            "name" : "City",
            "status" : "closed"
        },
        {
            "name" : "Country",
            "status" : "closed"
        }
    ]
}

Expected result:

{
    "_id" : "Tvq579754r",
    "name" : "Tom",
    "forms" : [
        {
            "name" : "PreOp",
            "status" : "closed"
        },
        {
            "name" : "Alert",
            "status" : "closed"
        },
        {
            "name" : "City",
            "status" : "closed"
        },
        {
            "name" : "Country",
            "status" : "closed"
        }
    ]
}

As there is no standard query operator to match all of the elements of the array under this condition, the solution was found by using aggregation. This would return the _id of the documents in the collection that have all of their "forms" elements set to the status of "closed".

db.forms.aggregate([
    {$unwind: "$forms" },
    {$group: { _id: "$_id", status: {$addToSet: "$forms.status" }}},
    {$unwind: "$status"},
    {$sort: { _id: 1, status: -1 }},
    {$group: {_id: "$_id", status: {$first: "$status"}}},
    {$match:{ status: "closed" }}
])
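
To see why this pipeline hands back only identifiers, the stages can be emulated in plain JavaScript. This is a minimal sketch, not MongoDB API: `docs` copies the two sample documents above, and `closedOnlyIds` is a hypothetical helper name introduced here for illustration.

```javascript
// Plain-JavaScript emulation of the pipeline above; no MongoDB calls are used.
const docs = [
  { _id: "Tvq444454j", name: "Jim", forms: [
    { name: "Jorney", status: "closed" }, { name: "Women", status: "void" },
    { name: "Child", status: "closed" }, { name: "Farm", status: "closed" } ] },
  { _id: "Tvq579754r", name: "Tom", forms: [
    { name: "PreOp", status: "closed" }, { name: "Alert", status: "closed" },
    { name: "City", status: "closed" }, { name: "Country", status: "closed" } ] }
];

function closedOnlyIds(collection) {
  // $unwind + $group with $addToSet: distinct statuses per _id
  const statusesById = new Map();
  for (const doc of collection) {
    for (const form of doc.forms) {
      if (!statusesById.has(doc._id)) statusesById.set(doc._id, new Set());
      statusesById.get(doc._id).add(form.status);
    }
  }
  // second $unwind: one row per (_id, status) pair
  const rows = [];
  for (const [_id, statuses] of statusesById) {
    for (const status of statuses) rows.push({ _id, status });
  }
  // $sort: _id ascending, status descending
  rows.sort((a, b) => {
    if (a._id !== b._id) return a._id < b._id ? -1 : 1;
    if (a.status !== b.status) return a.status < b.status ? 1 : -1;
    return 0;
  });
  // $group with $first: keep the first row seen for each _id
  const firstById = new Map();
  for (const row of rows) {
    if (!firstById.has(row._id)) firstById.set(row._id, row);
  }
  // final $match: keep rows whose surviving status is "closed"
  return [...firstById.values()].filter(row => row.status === "closed");
}

const result = closedOnlyIds(docs);
console.log(result);
```

The surviving row carries only `_id` and `status`; `name` and `forms` were dropped at the first grouping. The descending sort works here because "void" sorts after "closed"; a status that sorts before "closed" would not be caught by this particular trick.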

So as I would be expecting to return many documents in the results, I would like to avoid issuing another find, or series of finds just to get the documents that match the returned _id's.

Considering this, is there any way that I can get the original documents back from aggregation in exactly the same form as they are in the collection, while still doing this type of filtering?

Recommended answer

Falling under the category of stupid aggregation tricks is a little technique that often gets overlooked.

The query does all of its grouping around the document _id, which is the unique identifier for the document. The main point to realize is that the whole document is itself already a unique identifier. So instead of stashing just the _id value in the grouping key, stash the whole document:

    {$project: { 
        _id: { _id: "$_id", name: "$name", forms: "$forms" }, forms: "$forms"}
    },

Once this is done, anything that is rolled up by _id retains the document in its original form. At the end of all the other aggregation stages, issue a final $project to restore the true original document form:

    {$project: { _id: "$_id._id", name: "$_id.name", forms: "$_id.forms"}}
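
The wrap-and-unwrap idea can be sketched in plain JavaScript. This is an illustration under assumptions, not MongoDB API: `docs` is a shortened version of the sample data, and the middle step is a simplified stand-in for the $unwind/$group/$sort/$match stages.

```javascript
// Plain-JavaScript sketch of using the whole document as the grouping key.
const docs = [
  { _id: "Tvq444454j", name: "Jim", forms: [
    { name: "Jorney", status: "closed" }, { name: "Women", status: "void" } ] },
  { _id: "Tvq579754r", name: "Tom", forms: [
    { name: "PreOp", status: "closed" }, { name: "Alert", status: "closed" } ] }
];

// First $project: copy the whole document into _id, keep forms for processing
const wrapped = docs.map(d => ({
  _id: { _id: d._id, name: d.name, forms: d.forms },
  forms: d.forms
}));

// Simplified stand-in for the middle stages: reduce each document to its
// distinct statuses, then keep those whose only status is "closed".
// The wrapped _id travels through untouched.
const filtered = wrapped
  .map(w => ({ _id: w._id, statuses: [...new Set(w.forms.map(f => f.status))] }))
  .filter(g => g.statuses.length === 1 && g.statuses[0] === "closed");

// Final $project: unwrap the stashed document back to its original shape
const restored = filtered.map(g => ({
  _id: g._id._id,
  name: g._id.name,
  forms: g._id.forms
}));

console.log(JSON.stringify(restored, null, 2));
```

Because the stashed copy is never modified by the intermediate steps, the unwrapped result has the same fields and values as the document stored in the collection.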

Then you will have the filtered results that you want. This technique can be very handy when used with advanced filtering such as in the case of this query, as it removes the need to issue an additional find on all of the results.

Also, where you know you are only looking for results that match a certain set of conditions, use a $match operator as the first stage of the aggregation pipeline. This not only reduces the working set size, it is also the point at which the pipeline can make use of an index, which can significantly increase query performance.
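
Note that a condition such as { "forms.status": "closed" } matches a document when any element of the array satisfies it, so an initial $match of that kind only narrows the input and does not replace the later filtering. A plain-JavaScript sketch of the difference follows; the data is shortened, and the third document ("Ann") is hypothetical and not part of the question.

```javascript
// Plain-JavaScript sketch: "any element matches" versus "every element matches".
const docs = [
  { _id: "Tvq444454j", name: "Jim", forms: [
    { name: "Jorney", status: "closed" }, { name: "Women", status: "void" } ] },
  { _id: "Tvq579754r", name: "Tom", forms: [
    { name: "PreOp", status: "closed" }, { name: "Alert", status: "closed" } ] },
  // Hypothetical extra document with no closed forms at all
  { _id: "example-3", name: "Ann", forms: [
    { name: "Intake", status: "void" }, { name: "Review", status: "void" } ] }
];

// Emulates {$match: {"forms.status": "closed"}}: true if ANY form is closed
const prefiltered = docs.filter(d => d.forms.some(f => f.status === "closed"));

// The condition the rest of the pipeline enforces: EVERY form is closed
const allClosed = prefiltered.filter(d => d.forms.every(f => f.status === "closed"));

console.log(prefiltered.map(d => d.name)); // Jim and Tom pass the pre-filter
console.log(allClosed.map(d => d.name));   // only Tom passes the full check
```

The pre-filter discards documents that cannot possibly qualify (Ann here), while documents with mixed statuses (Jim) still have to be eliminated by the later stages.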

The whole thing together:

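db.forms.aggregate([
    // Pre-filter: keep documents with at least one "closed" form
    {$match: { "forms.status": "closed" } },
    // Stash the whole original document in _id
    {$project: {
        _id: { _id: "$_id", name: "$name", forms: "$forms" }, forms: "$forms"}
    },
    // Collect the distinct statuses of each document
    {$unwind: "$forms"},
    {$group: { _id: "$_id", status: {$addToSet: "$forms.status"}}},
    // Per document, keep the status that sorts first in descending order
    {$unwind: "$status"},
    {$sort: { _id: 1, status: -1} },
    {$group: { _id: "$_id", status: {$first: "$status"} }},
    // Keep only documents whose surviving status is "closed"
    {$match: { status: "closed"}},
    // Restore the original document shape from the stashed copy
    {$project: { _id: "$_id._id", name: "$_id.name", forms: "$_id.forms"}}
])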
