MongoDB find subdocument and sort the results
Problem description
I have a collection in MongoDB with a complex structure and subdocuments. The documents have a structure like this:
doc1 = {
    '_id': '12345678',
    'url': "http//myurl/...",
    'nlp': {
        "status": "OK",
        "entities": {
            "0": {
                "type": "Person",
                "relevance": "0.877245",
                "text": "Neelie Kroes"
            },
            "1": {
                "type": "Company",
                "relevance": "0.36242",
                "text": "ICANN"
            },
            "2": {
                "type": "Company",
                "relevance": "0.265175",
                "text": "IANA"
            }
        }
    }
}
doc2 = {
    '_id': '987456321',
    'url': "http//myurl2/...",
    'nlp': {
        "status": "OK",
        "entities": {
            "0": {
                "type": "Company",
                "relevance": "0.96",
                "text": "ICANN"
            },
            "1": {
                "type": "Person",
                "relevance": "0.36242",
                "text": "Neelie Kroes"
            },
            "2": {
                "type": "Company",
                "relevance": "0.265175",
                "text": "IANA"
            }
        }
    }
}
My task is to search for "type" AND "text" inside the subdocument, then sort by "relevance". With the $elemMatch operator I'm able to perform the query:
db.resource.find({
    'nlp.entities': {
        '$elemMatch': {'text': 'Neelie Kroes', 'type': 'Person'}
    }
});
Perfect, now I have to sort all the records with entities of type "Person" and value "Neelie Kroes" by relevance descending.
I tried a normal sort(), but, as the manual says about sort() combined with $elemMatch, the result may not reflect the sort order, because sort() is applied to the elements of the array before the $elemMatch projection.
In fact, the document with _id 987456321 comes first (it has a relevance of 0.96, but that value belongs to ICANN).
How can I sort my documents by the relevance of the matched subdocument?
P.S.: I can't change the document structure.
Recommended answer
As noted, I hope your documents actually do have an array, but if $elemMatch is working for you then they should.
At any rate, you cannot sort by an element in an array using find. But there is a case where you can do this using .aggregate():
db.collection.aggregate([
    // Match the documents that you want, containing the array
    { "$match": {
        "nlp.entities": {
            "$elemMatch": {
                "text": "Neelie Kroes",
                "type": "Person"
            }
        }
    }},
    // Project to "store" the whole document for later, duplicating the array
    { "$project": {
        "_id": {
            "_id": "$_id",
            "url": "$url",
            "nlp": "$nlp"
        },
        "entities": "$nlp.entities"
    }},
    // Unwind the array to de-normalize
    { "$unwind": "$entities" },
    // Match "only" the relevant entities
    { "$match": {
        "entities.text": "Neelie Kroes",
        "entities.type": "Person"
    }},
    // Sort on the relevance
    { "$sort": { "entities.relevance": -1 } },
    // Restore the original document form
    { "$project": {
        "_id": "$_id._id",
        "url": "$_id.url",
        "nlp": "$_id.nlp"
    }}
])
So essentially, after the first $match has selected the documents that contain the relevant match, you use $project to "store" the original document in the _id field and then $unwind a "copy" of the "entities" array.
The next $match "filters" the array contents to only those that are relevant. Then you apply the $sort to the "matched" documents.
As the "original" document was stored under _id, you use $project to "restore" the structure that the document actually had to begin with.
That is how you "sort" on your matched element of an array.
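To make the stages above easier to follow, here is a minimal in-memory sketch in plain JavaScript that mimics each step on simplified copies of the two sample documents. It is only an illustration, not how MongoDB executes the pipeline. It assumes that "entities" is a real array and that "relevance" is numeric (the sample documents store it as a string), and the names sampleDocs, isTarget and sortedDocs are made up for this example.

```javascript
// Simplified copies of the two documents from the question.
// Assumptions: entities is an array, relevance is a number.
const sampleDocs = [
  { _id: '12345678', nlp: { status: 'OK', entities: [
    { type: 'Person', relevance: 0.877245, text: 'Neelie Kroes' },
    { type: 'Company', relevance: 0.36242, text: 'ICANN' },
    { type: 'Company', relevance: 0.265175, text: 'IANA' },
  ] } },
  { _id: '987456321', nlp: { status: 'OK', entities: [
    { type: 'Company', relevance: 0.96, text: 'ICANN' },
    { type: 'Person', relevance: 0.36242, text: 'Neelie Kroes' },
    { type: 'Company', relevance: 0.265175, text: 'IANA' },
  ] } },
];

// The condition used by both $match stages
const isTarget = (e) => e.text === 'Neelie Kroes' && e.type === 'Person';

const sortedDocs = sampleDocs
  // first $match: keep documents that contain at least one matching entity
  .filter((doc) => doc.nlp.entities.some(isTarget))
  // $project + $unwind: one row per entity, each carrying the whole original document
  .flatMap((doc) => doc.nlp.entities.map((entity) => ({ original: doc, entity })))
  // second $match: keep only the rows for the matching entity
  .filter((row) => isTarget(row.entity))
  // $sort: descending by the relevance of the matched entity
  .sort((a, b) => b.entity.relevance - a.entity.relevance)
  // final $project: restore the original document shape
  .map((row) => row.original);

console.log(sortedDocs.map((doc) => doc._id)); // [ '12345678', '987456321' ]
```

The document with _id 12345678 now comes first, because its "Neelie Kroes" entity has relevance 0.877245, while the 0.96 that belongs to ICANN in the other document no longer takes part in the sort.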
Note that if you had multiple "matches" within an array for a parent document, then you would have to employ an additional $group stage to get the $max value for the "relevance" field in order to complete your sort.
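As a rough sketch of that variant, the pipeline above could be extended as follows. This is an assumption about how the extra stage might look, not something taken from the original answer: the variable name groupedPipeline and the output field name "relevance" are illustrative choices.

```javascript
// Hypothetical variant of the pipeline above for parent documents
// that contain more than one matching entity.
const groupedPipeline = [
  { "$match": {
    "nlp.entities": { "$elemMatch": { "text": "Neelie Kroes", "type": "Person" } }
  }},
  { "$project": {
    "_id": { "_id": "$_id", "url": "$url", "nlp": "$nlp" },
    "entities": "$nlp.entities"
  }},
  { "$unwind": "$entities" },
  { "$match": { "entities.text": "Neelie Kroes", "entities.type": "Person" } },
  // Collapse back to one row per original document, keeping its highest matching relevance
  { "$group": {
    "_id": "$_id",
    "relevance": { "$max": "$entities.relevance" }
  }},
  // Sort on the grouped value instead of the unwound field
  { "$sort": { "relevance": -1 } },
  // Restore the original document form
  { "$project": { "_id": "$_id._id", "url": "$_id.url", "nlp": "$_id.nlp" } }
];

// In the mongo shell this would be run as: db.collection.aggregate(groupedPipeline)
```

Grouping on the stored _id yields one result per original document, so each parent appears once in the sorted output. If "relevance" is stored as a string, as in the sample documents, both $max and $sort compare the values as strings rather than as numbers.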