Find distinct documents with max value of a field in MongoDB
Question
I have thousands of documents in MongoDB; a few samples are shown below:
{"title":"Foo", "hash": "1234567890abcedf", "num_sold": 49,
"created": "2013-03-09 00:00:00"}
{"title":"Bar", "hash": "1234567890abcedf", "num_sold": 55,
"created": "2013-03-11 00:00:00"}
{"title":"Baz", "hash": "1234567890abcedf", "num_sold": 55,
"created": "2013-03-10 00:00:00"}
{"title":"Spam", "hash": "abcedef1234567890", "num_sold": 20,
"created": "2013-03-11 00:00:00"}
{"title":"Eggs", "hash": "abc1234567890def", "num_sold": 20,
"created": "2013-03-11 00:00:00"}
Is it possible to select, for each distinct hash, the document with the maximum num_sold and, if more than one document has that same num_sold, the latest one according to the created field?
I use PyMongo as the client.
Recommended answer
I am no Python expert, so I will write this in JavaScript. You can do this with the aggregation framework using the $sort, $group and $first operators:
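db.col.aggregate([
    // Highest num_sold first; ties are broken by the most recent created value
    {$sort: {num_sold: -1, created: -1}},
    // Keep the values of the first (best) document seen for each hash
    {$group: {_id: '$hash', num_sold: {$first: '$num_sold'}, _id_seen: {$first: '$_id'}}}
])
Essentially, I sort the incoming documents by num_sold descending and then by created descending, group on hash so that documents sharing a hash collapse into one group, and take the first document of each sorted group. That document has the highest num_sold for its hash and, among ties, is the newest. Sorting on created alone would return the newest document per hash regardless of num_sold, which is not what the question asks for. Note that created is stored as a string here; the ordering still works because the "YYYY-MM-DD HH:MM:SS" format sorts chronologically.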
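Since the question mentions PyMongo, here is a rough Python sketch of the same idea. It assumes the sort is on num_sold descending and then created descending, which is what the question asks for. PIPELINE is the stage list in the form PyMongo accepts; best_per_hash is a hypothetical helper (not part of PyMongo) that emulates the $sort and $group/$first stages on a plain list so the logic can be checked without a server. The extra title and created fields in the $group stage are my addition for readability.

```python
# Sample documents copied from the question.
SAMPLE_DOCS = [
    {"title": "Foo", "hash": "1234567890abcedf", "num_sold": 49,
     "created": "2013-03-09 00:00:00"},
    {"title": "Bar", "hash": "1234567890abcedf", "num_sold": 55,
     "created": "2013-03-11 00:00:00"},
    {"title": "Baz", "hash": "1234567890abcedf", "num_sold": 55,
     "created": "2013-03-10 00:00:00"},
    {"title": "Spam", "hash": "abcedef1234567890", "num_sold": 20,
     "created": "2013-03-11 00:00:00"},
    {"title": "Eggs", "hash": "abc1234567890def", "num_sold": 20,
     "created": "2013-03-11 00:00:00"},
]

# Aggregation stages as plain dicts. Key order inside "$sort" matters;
# Python 3.7+ dicts preserve insertion order.
PIPELINE = [
    {"$sort": {"num_sold": -1, "created": -1}},
    {"$group": {
        "_id": "$hash",
        "num_sold": {"$first": "$num_sold"},
        "title": {"$first": "$title"},      # added for illustration
        "created": {"$first": "$created"},  # added for illustration
    }},
]


def best_per_hash(docs):
    """Hypothetical local emulation of the $sort + $group/$first stages."""
    # Sort by num_sold, then created, both descending. The created strings
    # compare chronologically because of their fixed format.
    ordered = sorted(docs, key=lambda d: (d["num_sold"], d["created"]),
                     reverse=True)
    best = {}
    for doc in ordered:
        # setdefault keeps only the first document seen for each hash.
        best.setdefault(doc["hash"], doc)
    return best


if __name__ == "__main__":
    for hash_value, doc in best_per_hash(SAMPLE_DOCS).items():
        print(hash_value, doc["title"], doc["num_sold"], doc["created"])
```

With a live connection, the pipeline would be passed as db.col.aggregate(PIPELINE), where db is an assumed PyMongo database handle and col is the collection name used in the answer.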