How to get the original document back after aggregation

Problem description

I have a case where I want to query a collection of documents that each have a number of items under an array field "forms". The goal is to return only the documents in which every item contained in "forms" has a particular status of "closed".

Here is a sample of two different documents in the collection:

{
    "_id" : "Tvq444454j",
    "name" : "Jim",
    "forms" : [
        {
            "name" : "Jorney",
            "status" : "closed"
        },
        {
            "name" : "Women",
            "status" : "void"
        },
        {
            "name" : "Child",
            "status" : "closed"
        },
        {
            "name" : "Farm",
            "status" : "closed"
        }
    ]
},

{
    "_id" : "Tvq579754r",
    "name" : "Tom",
    "forms" : [
        {
            "name" : "PreOp",
            "status" : "closed"
        },
        {
            "name" : "Alert",
            "status" : "closed"
        },
        {
            "name" : "City",
            "status" : "closed"
        },
        {
            "name" : "Country",
            "status" : "closed"
        }
    ]
}

And the expected result:

{
    "_id" : "Tvq579754r",
    "name" : "Tom",
    "forms" : [
        {
            "name" : "PreOp",
            "status" : "closed"
        },
        {
            "name" : "Alert",
            "status" : "closed"
        },
        {
            "name" : "City",
            "status" : "closed"
        },
        {
            "name" : "Country",
            "status" : "closed"
        }
    ]
}

As there is no standard query operator to match all of the elements of the array under this condition, the solution was found by using aggregation. The following returns the _id of each document in the collection whose "forms" elements are all set to the status of "closed".

db.forms.aggregate([
    {$unwind: "$forms" },
    {$group: { _id: "$_id", status: {$addToSet: "$forms.status" }}},
    {$unwind: "$status"},
    {$sort: { _id: 1, status: -1 }},
    {$group: {_id: "$_id", status: {$first: "$status"}}},
    {$match:{ status: "closed" }}
])
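To make the stages easier to follow, here is a minimal in-memory sketch in plain JavaScript of what the pipeline computes. It does not talk to MongoDB. The helper name idsWithTopStatus and the variable names are assumptions made for illustration, and the data is the sample from the question.

```javascript
// In-memory sketch of the pipeline's logic (plain JavaScript, not MongoDB).
const docs = [
  { _id: "Tvq444454j", name: "Jim", forms: [
    { name: "Jorney", status: "closed" },
    { name: "Women", status: "void" },
    { name: "Child", status: "closed" },
    { name: "Farm", status: "closed" },
  ] },
  { _id: "Tvq579754r", name: "Tom", forms: [
    { name: "PreOp", status: "closed" },
    { name: "Alert", status: "closed" },
    { name: "City", status: "closed" },
    { name: "Country", status: "closed" },
  ] },
];

// Hypothetical helper: returns the _id values the pipeline would keep.
function idsWithTopStatus(collection, wanted) {
  return collection
    .map((doc) => {
      // $unwind + $group with $addToSet: distinct statuses per document
      const statuses = [...new Set(doc.forms.map((f) => f.status))];
      // $unwind + $sort { status: -1 } + $group with $first:
      // the status that sorts last becomes the document's single status
      statuses.sort().reverse();
      return { _id: doc._id, status: statuses[0] };
    })
    // final $match
    .filter((row) => row.status === wanted)
    .map((row) => row._id);
}

const matchedIds = idsWithTopStatus(docs, "closed");
console.log(matchedIds); // only Tom's _id remains
```

One thing the sketch makes visible: the approach relies on every other status sorting after "closed". With the sample data that holds ("void" sorts after "closed"), but a document whose statuses were "active" and "closed" would also pass, so the sort trick should be checked against the real set of status values.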

Since I expect to return many documents in the results, I would like to avoid issuing another find, or a series of finds, just to get the documents that match the returned _id values.

Considering this, is there any way to get the original documents back from aggregation in exactly the same form as they are in the collection, while still doing this type of filtering?

Recommended answer

Falling under the category of stupid aggregation tricks is a little technique that often gets overlooked.

The query does all of its grouping around the document _id, which is the unique identifier for the document. The main point to consider is that the whole document is actually a unique identifier already. So instead of stashing just the _id value in the _id key, use the whole document.

    {$project: { 
        _id: { _id: "$_id", name: "$name", forms: "$forms" }, forms: "$forms"}
    },

Once this is done, anything that is rolled up by the _id retains the document in its original form. At the end of all other aggregation stages, issue a final $project to restore the true original document form:

    {$project: { _id: "$_id._id", name: "$_id.name", forms: "$_id.forms"}}

Then you will have the filtered results that you want. This technique can be very handy when used with advanced filtering, as in the case of this query, because it removes the need to issue an additional find on all of the results.
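The stash-and-restore idea can be sketched in plain JavaScript as a pair of functions that mirror the two $project stages. This is only an illustration under assumptions: the function names stash and restore and the shortened sample document are hypothetical, and the middle step stands in for whatever $group stages sit between the two projections.

```javascript
// Hypothetical sample document, shaped like the ones in the question.
const original = {
  _id: "Tvq579754r",
  name: "Tom",
  forms: [{ name: "PreOp", status: "closed" }],
};

// Mirrors the first $project: copy the whole document into a compound _id.
function stash(doc) {
  return {
    _id: { _id: doc._id, name: doc.name, forms: doc.forms },
    forms: doc.forms,
  };
}

// Mirrors the final $project: lift the fields back out of the compound _id.
function restore(row) {
  return { _id: row._id._id, name: row._id.name, forms: row._id.forms };
}

const stashed = stash(original);
// A $group keeps only _id plus its accumulators, so every other field is
// replaced; the compound _id is what carries the document through.
const afterGroup = { _id: stashed._id, status: "closed" };
const restored = restore(afterGroup);

console.log(JSON.stringify(restored) === JSON.stringify(original)); // true
```

Because the grouping key is the entire document, nothing has to be fetched again afterwards; the trade-off is that each intermediate row carries a full copy of the document.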

Also, in a case like this, where you know you are only looking for results that match a certain set of conditions, use a $match operator as the first stage of the aggregation pipeline. This is not only useful in reducing the working set size; a $match at the start of the pipeline can also make use of an index, which can significantly improve query performance.
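It is worth being clear about what that initial $match on "forms.status" does: a condition on a field inside an array matches when at least one element satisfies it, so it only narrows the input and the later stages are still needed. A small in-memory sketch in plain JavaScript, using made-up sample data and hypothetical variable names:

```javascript
// Made-up sample data; only the fields relevant to the filter are included.
const sample = [
  { _id: "a", forms: [{ status: "closed" }, { status: "void" }] },
  { _id: "b", forms: [{ status: "closed" }, { status: "closed" }] },
  { _id: "c", forms: [{ status: "void" }] },
];

// Equivalent of {$match: {"forms.status": "closed"}}:
// a document passes when at least one form is closed.
const prefiltered = sample.filter((doc) =>
  doc.forms.some((form) => form.status === "closed")
);

// What the rest of the pipeline is meant to establish:
// every form in the document is closed.
const allClosed = prefiltered.filter((doc) =>
  doc.forms.every((form) => form.status === "closed")
);

console.log(prefiltered.map((doc) => doc._id)); // "c" is dropped early
console.log(allClosed.map((doc) => doc._id));   // only "b" survives
```

The pre-filter discards documents that cannot possibly qualify, which is where the saving comes from, but a mixed document such as "a" still has to be removed by the grouping stages.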

The whole process:

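db.forms.aggregate([
    // Pre-filter: keep documents with at least one "closed" form
    {$match: { "forms.status": "closed" } },
    // Stash the whole document in _id so it survives the grouping
    {$project: { 
        _id: { _id: "$_id", name: "$name", forms: "$forms" }, forms: "$forms"}
    },
    // Collect the distinct statuses per document
    {$unwind: "$forms"},
    {$group: { _id: "$_id", status: {$addToSet: "$forms.status"}}},
    // Keep only the status that sorts last for each document
    {$unwind: "$status"},
    {$sort: { _id: 1, status: -1} },
    {$group: { _id: "$_id", status: {$first: "$status"} }},
    {$match: { status: "closed"}},
    // Restore the original document shape
    {$project: { _id: "$_id._id", name: "$_id.name", forms: "$_id.forms"}}
])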
