Sort array in query and project all fields

Question

I would like to sort a nested array at query time while also projecting all fields in the document.

Example document:

{ "_id" : 0, "unknown_field" : "foo", "array_to_sort" : [ { "a" : 3, "b" : 4 }, { "a" : 3, "b" : 3 }, { "a" : 1, "b" : 0 } ] }

I can perform the sort with an aggregation, but I cannot preserve all the fields I need. The application does not know at query time what other fields may appear in each document, so I cannot project them explicitly. If I had a wildcard to project all fields, then this would work:

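db.c.aggregate([
    // One output document per array element
    {$unwind: "$array_to_sort"},
    // Sort the elements by b, then by a (dot notation for both keys)
    {$sort: {"array_to_sort.b": 1, "array_to_sort.a": 1}},
    // Rebuild the array per document, now in sorted order
    {$group: {_id: "$_id", array_to_sort: {$push: "$array_to_sort"}}}
]);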

...but unfortunately, it produces a result that does not contain "unknown_field":

    {
        "_id" : 0,
        "array_to_sort" : [
            {
                "a" : 1,
                "b" : 0
            },
            {
                "a" : 3,
                "b" : 3
            },
            {
                "a" : 3,
                "b" : 4
            }
        ]
    }

Here is the insert command in case you would like to experiment:

db.c.insert({"unknown_field": "foo", "array_to_sort": [{"a": 3, "b": 4}, {"a": 3, "b":3}, {"a": 1, "b":0}]})

I cannot pre-sort the array because the sort criteria are dynamic. I may be sorting by any combination of a and/or b, ascending or descending, at query time. I realize I may need to do this in my client application, but it would be sweet if I could do it in mongo, because then I could also $slice/skip/limit the results for paging instead of retrieving the entire array every time.

Recommended answer

Since you are grouping on the document _id, you can simply place the fields you wish to keep within the grouping _id. Then you can re-form the document using $project:

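db.c.aggregate([
    { "$unwind": "$array_to_sort"},
    // Dot notation for both sort keys
    { "$sort": {"array_to_sort.b": 1, "array_to_sort.a": 1}},
    { "$group": {
        // Carry the fields to keep inside the compound grouping key
        "_id": {
            "_id": "$_id",
            "unknown_field": "$unknown_field"
        },
        // Temporary name for the rebuilt array
        "Oarray_to_sort": { "$push": "$array_to_sort"}
    }},
    // Restore the original field names and order
    { "$project": {
        "_id": "$_id._id",
        "unknown_field": "$_id.unknown_field",
        "array_to_sort": "$Oarray_to_sort"
    }}
]);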

The other "trick" in there is using a temporary name for the array in the grouping stage. This way, when you $project and change the name, you get the fields in the order specified in the projection statement. If you did not, the "array_to_sort" field would not be the last field in the order, because it would be copied over from the prior stage.

That is an intended optimization in $project, but if you want the order then you can do it as above.
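Since the question says the sort criteria are dynamic, the pipeline above could be generated rather than written by hand. The following is only a sketch: buildSortPipeline, its parameters, and the [fieldName, direction] pair format are hypothetical names introduced here for illustration, not part of MongoDB or of the original answer. It still assumes the fields to keep are known in advance.

```javascript
// Hypothetical helper (an assumption for illustration, not a MongoDB API).
// criteria: array of [fieldName, direction] pairs, direction 1 or -1.
// keepFields: top-level fields to carry through the $group stage.
function buildSortPipeline(arrayField, criteria, keepFields) {
    var sortSpec = {};
    criteria.forEach(function (pair) {
        // Dot notation addresses a field inside the unwound array element.
        sortSpec[arrayField + "." + pair[0]] = pair[1];
    });

    var groupId = { "_id": "$_id" };
    var projection = { "_id": "$_id._id" };
    keepFields.forEach(function (name) {
        groupId[name] = "$" + name;
        projection[name] = "$_id." + name;
    });

    // Temporary array name, following the "trick" described in the answer.
    var tempName = "O" + arrayField;
    var group = { "_id": groupId };
    group[tempName] = { "$push": "$" + arrayField };
    projection[arrayField] = "$" + tempName;

    return [
        { "$unwind": "$" + arrayField },
        { "$sort": sortSpec },
        { "$group": group },
        { "$project": projection }
    ];
}

var pipeline = buildSortPipeline(
    "array_to_sort", [["b", 1], ["a", 1]], ["unknown_field"]);
// In the mongo shell this could then be run as: db.c.aggregate(pipeline)
```

The order of the pairs in criteria determines the key order of the $sort document, which is what decides sort precedence.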

For completely unknown structures there is the mapReduce way of doing things:

db.c.mapReduce(
    function () {
        this["array_to_sort"].sort(function(a,b) {
            return a.a - b.a || a.b - b.b;
        });

        emit( this._id, this );
    },
    function(){},
    { "out": { "inline": 1 } }
)

Of course that has an output format that is specific to mapReduce and is therefore not exactly the document you had, but all the fields are contained under "value":

{
    "results" : [
            {
                    "_id" : 0,
                    "value" : {
                            "_id" : 0,
                            "some_field" : "a",
                            "array_to_sort" : [
                                    {
                                            "a" : 1,
                                            "b" : 0
                                    },
                                    {
                                            "a" : 3,
                                            "b" : 3
                                    },
                                    {
                                            "a" : 3,
                                            "b" : 4
                                    }
                            ]
                    }
            }
    ],
}
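The comparator in the map function above is fixed (a, then b, both ascending). For the dynamic criteria described in the question, a comparator could be built from a list of criteria instead. This is a sketch in plain JavaScript: makeComparator and the [fieldName, direction] pair format are hypothetical names, and how the criteria reach the server-side map function (for example through the mapReduce scope option) is left as an assumption here.

```javascript
// Hypothetical helper: builds a comparator from [fieldName, direction] pairs.
function makeComparator(criteria) {
    return function (x, y) {
        for (var i = 0; i < criteria.length; i++) {
            var field = criteria[i][0];
            var direction = criteria[i][1];   // 1 ascending, -1 descending
            if (x[field] < y[field]) return -direction;
            if (x[field] > y[field]) return direction;
        }
        return 0;   // equal on every criterion
    };
}

var items = [ { "a": 3, "b": 4 }, { "a": 3, "b": 3 }, { "a": 1, "b": 0 } ];

// Sort a copy by a descending, then b ascending, and keep a "page" of two.
var page = items.slice()
    .sort(makeComparator([["a", -1], ["b", 1]]))
    .slice(0, 2);
```

The same comparator works client-side, which is the fallback the question already considers; the trailing slice(0, 2) stands in for the paging step.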

Future releases (as of writing) allow you to use the $$ROOT variable in aggregate to represent the whole document:

db.c.aggregate([
    // Stash the whole original document under _id
    { "$project": {
        "_id": "$$ROOT",
        "array_to_sort": "$array_to_sort"
    }},
    { "$unwind": "$array_to_sort"},
    // Dot notation for both sort keys
    { "$sort": {"array_to_sort.b": 1, "array_to_sort.a": 1}},
    { "$group": {
        "_id": "$_id",
        "array_to_sort": { "$push": "$array_to_sort"}
    }}
]);

There is no point in adding a final $project stage here, since you do not actually know the other fields in the document. But they will all be contained within the _id field of the result document, including the original array in its original order.
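If the original document shape is wanted back, the nested _id could be flattened in the client once the result arrives. A minimal sketch, assuming the result shape produced by the pipeline above; restoreDocument is a hypothetical helper name, and the sample result is written out by hand from the example document rather than captured from a server.

```javascript
// Hypothetical client-side helper for the $$ROOT result shape.
function restoreDocument(result) {
    var doc = {};
    // Copy every field of the original document, which sits under _id.
    Object.keys(result._id).forEach(function (key) {
        doc[key] = result._id[key];
    });
    // Replace the original (unsorted) array with the sorted one.
    doc.array_to_sort = result.array_to_sort;
    return doc;
}

// Hand-written sample in the shape the pipeline above is expected to return.
var result = {
    "_id": {
        "_id": 0,
        "unknown_field": "foo",
        "array_to_sort": [ { "a": 3, "b": 4 }, { "a": 3, "b": 3 }, { "a": 1, "b": 0 } ]
    },
    "array_to_sort": [ { "a": 1, "b": 0 }, { "a": 3, "b": 3 }, { "a": 3, "b": 4 } ]
};

var doc = restoreDocument(result);
```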
