MongoDB find subdocument and sort the results

Question

I have a collection in MongoDB with a complex structure and subdocuments. The documents have a structure like this:

doc1 = {
    '_id': '12345678',
    'url': "http//myurl/...",
    'nlp':{
        "status": "OK",
        "entities": {
            "0": {
                "type" : "Person",
                "relevance": "0.877245",
                "text" : "Neelie Kroes"
            },
            "1": {
                "type": "Company",
                "relevance": "0.36242",
                "text": "ICANN"
            },
            "2": {
                "type": "Company",
                "relevance": "0.265175",
                "text": "IANA" 
            }
        }
    }
}


doc2 = {
    '_id': '987456321',
    'url': "http//myurl2/...",
    'nlp':{
        "status": "OK",
        "entities": {
            "0": {
                "type": "Company",
                "relevance": "0.96",
                "text": "ICANN"
            },
            "1": {
                "type" : "Person",
                "relevance": "0.36242",
                "text" : "Neelie Kroes"
            },
            "2": {
                "type": "Company",
                "relevance": "0.265175",
                "text": "IANA" 
            }
        }
    }
}

My task is to search for "type" AND "text" inside the subdocument, then sort by "relevance". With the $elemMatch operator I'm able to perform the query:

db.resource.find({
    'nlp.entities': {
        '$elemMatch': {'text': 'Neelie Kroes', 'type': 'Person'}
    }
});

Perfect, now I have to sort all the records with entities of type "Person" and value "Neelie Kroes" by relevance descending.

I tried a normal "sort", but the manual's note on sort() with $elemMatch says that the result may not reflect the sort order, because sort() is applied to the elements of the array before the $elemMatch projection.

In fact, the document with _id 987456321 comes first (it has a relevance of 0.96, but that value belongs to ICANN, not to the matched entity).

How can I sort my documents by the matched subdocument's relevance?

P.S.: I can't change the document structure.

Recommended answer

As noted I hope your documents actually do have an array, but if $elemMatch is working for you then they should.
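
For reference, $elemMatch only matches fields that hold an array, so the query in the question implies a shape like the one sketched below. This is an assumption about the real data: the documents printed in the question show an object keyed by "0", "1", "2", which may simply be how the array was rendered. The variable name `doc1AsArray` is hypothetical, and the elided url field is left out.

```javascript
// Assumed shape: "entities" stored as an array of subdocuments
// (values copied from doc1 in the question).
const doc1AsArray = {
    "_id": "12345678",
    "nlp": {
        "status": "OK",
        "entities": [
            { "type": "Person", "relevance": "0.877245", "text": "Neelie Kroes" },
            { "type": "Company", "relevance": "0.36242", "text": "ICANN" },
            { "type": "Company", "relevance": "0.265175", "text": "IANA" }
        ]
    }
};
```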

At any rate, you cannot sort by an element in an array using find. But there is a case where you can do this using .aggregate():

db.resource.aggregate([

    // Match the documents that you want, containing the array
    { "$match": {
        "nlp.entities": {
            "$elemMatch": { 
                "text": "Neelie Kroes", 
                "type": "Person"   
            }
        }
    }},

    // Project to "store" the whole document for later, duplicating the array
    { "$project": {
        "_id": {
            "_id": "$_id",
            "url": "$url",
            "nlp": "$nlp"          
        },
        "entities": "$nlp.entities"
    }},

    // Unwind the array to de-normalize
    { "$unwind": "$entities" },

    // Match "only" the relevant entities
    { "$match": {
        "entities.text": "Neelie Kroes", 
        "entities.type": "Person"   
    }},

    // Sort on the relevance
    { "$sort": { "entities.relevance": -1 } },

    // Restore the original document form
    { "$project": {
        "_id": "$_id._id",
        "url": "$_id.url",
        "nlp": "$_id.nlp"
    }}
])

So essentially, after doing the $match condition for documents that contain the relevant match, you use $project to "store" the original document in the _id field and $unwind a "copy" of the "entities" array.

The next $match "filters" the array contents to only those ones that are relevant. Then you apply the $sort to the "matched" documents.

As the "original" document was stored under _id, you use $project to "restore" the structure that the document actually had to begin with.

That is how you "sort" on your matched element of an array.
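
To make the stages concrete, here is a rough in-memory sketch of the same idea in plain JavaScript. It does not use MongoDB at all. The helper name `sortByMatchedRelevance` is hypothetical, the sample data is trimmed from the question and assumes "entities" is an array, and relevance is converted with Number() for the comparison, whereas the values in the question are strings and MongoDB would compare them as strings.

```javascript
// Sample data trimmed from the question, assuming "entities" is an array.
const docs = [
    { _id: "12345678", entities: [
        { type: "Person", relevance: "0.877245", text: "Neelie Kroes" },
        { type: "Company", relevance: "0.36242", text: "ICANN" }
    ]},
    { _id: "987456321", entities: [
        { type: "Company", relevance: "0.96", text: "ICANN" },
        { type: "Person", relevance: "0.36242", text: "Neelie Kroes" }
    ]}
];

// Same condition as the second $match stage.
const isMatch = (e) => e.text === "Neelie Kroes" && e.type === "Person";

// Hypothetical helper mirroring the stages: unwind, match, sort, restore.
function sortByMatchedRelevance(documents) {
    return documents
        // "$unwind": one row per (document, entity) pair
        .flatMap((doc) => doc.entities.map((entity) => ({ doc, entity })))
        // "$match": keep only the relevant entities
        .filter(({ entity }) => isMatch(entity))
        // "$sort": descending by the matched entity's relevance
        .sort((a, b) => Number(b.entity.relevance) - Number(a.entity.relevance))
        // final "$project": give back the original document
        .map(({ doc }) => doc);
}

const sortedIds = sortByMatchedRelevance(docs).map((d) => d._id);
console.log(sortedIds);
```

With this data the document 12345678 sorts ahead of 987456321, because the comparison uses the relevance of the matched "Neelie Kroes" entity rather than the 0.96 that belongs to ICANN.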

Note that if you had multiple "matches" within an array for a parent document, then you would have to employ an additional $group stage to get the $max value for the "relevance" field in order to complete your sort.
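
A minimal sketch of that variant follows, with the pipeline held in a variable (the name `pipeline` and the output field `relevance` are my own choices, not from the original answer). It assumes the same array-shaped documents as above. Note that the relevance values in the question are strings, so $max and $sort would compare them as strings; that gives the expected order only while all values share the same "0.xxx" format.

```javascript
// Same stages as the answer's pipeline, plus a $group before the $sort.
const pipeline = [
    { "$match": {
        "nlp.entities": {
            "$elemMatch": { "text": "Neelie Kroes", "type": "Person" }
        }
    }},
    { "$project": {
        "_id": { "_id": "$_id", "url": "$url", "nlp": "$nlp" },
        "entities": "$nlp.entities"
    }},
    { "$unwind": "$entities" },
    { "$match": {
        "entities.text": "Neelie Kroes",
        "entities.type": "Person"
    }},
    // Collapse back to one row per original document,
    // keeping the highest relevance among its matched entities.
    { "$group": {
        "_id": "$_id",
        "relevance": { "$max": "$entities.relevance" }
    }},
    { "$sort": { "relevance": -1 } },
    { "$project": {
        "_id": "$_id._id",
        "url": "$_id.url",
        "nlp": "$_id.nlp"
    }}
];

// Usage in the mongo shell (collection name taken from the question):
// db.resource.aggregate(pipeline)
```

Grouping on the stored _id works here because that field holds the whole original document, so each parent document ends up as exactly one row before the sort.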
