Find distinct documents with max value of a field in MongoDB

Problem description

I have thousands of documents in MongoDB. A few samples are shown below:

{"title":"Foo", "hash": "1234567890abcedf", "num_sold": 49, 
"created": "2013-03-09 00:00:00"}

{"title":"Bar", "hash": "1234567890abcedf", "num_sold": 55, 
"created": "2013-03-11 00:00:00"}

{"title":"Baz", "hash": "1234567890abcedf", "num_sold": 55,
 "created": "2013-03-10 00:00:00"}

{"title":"Spam", "hash": "abcedef1234567890", "num_sold": 20,
 "created": "2013-03-11 00:00:00"}

{"title":"Eggs", "hash": "abc1234567890def", "num_sold": 20,
 "created": "2013-03-11 00:00:00"}

Is it possible to select one document per distinct hash, namely the one with the maximum num_sold, and, if more than one document has the same num_sold, the latest one according to the created field?

I use PyMongo as the client.

Recommended answer

I am no Python expert, so I will write this in JavaScript. You can do this with the aggregation framework using the $sort, $group and $first operators:

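db.col.aggregate([
    // Highest num_sold first; ties are broken by the most recent created value.
    {$sort: {num_sold: -1, created: -1}},
    // Keep the first (best) document seen for each hash.
    {$group: {_id: '$hash', num_sold: {$first: '$num_sold'}, _id_seen: {$first: '$_id'}}}
])

Essentially, I sort the incoming documents by num_sold descending and then by created descending, group on hash so that documents sharing a hash collapse into a single group, and take the first document of each sorted group. That document has the highest num_sold for its hash and, among ties, the newest created value. Since created is stored as a "YYYY-MM-DD HH:MM:SS" string in the samples, string order matches chronological order.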
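Since the question mentions PyMongo, here is a minimal Python sketch of the same idea. It sorts by num_sold first and by created second, so the first document in each hash group is the one with the highest num_sold and, among ties, the newest created. The function names and the extra title and created fields in the $group stage are illustrative assumptions, not part of the original answer, and the sketch assumes Python 3.7+ (so the key order of the $sort dict is preserved) and PyMongo 3 or later (where aggregate returns a cursor).

```python
def best_per_hash_pipeline():
    """Build the aggregation stages (hypothetical helper name)."""
    return [
        # Highest num_sold first; the newest created value breaks ties.
        # Key order matters here: num_sold must come before created.
        {"$sort": {"num_sold": -1, "created": -1}},
        # After sorting, $first picks the winning document of each hash group.
        {"$group": {
            "_id": "$hash",
            "num_sold": {"$first": "$num_sold"},
            "created": {"$first": "$created"},
            "title": {"$first": "$title"},
            "_id_seen": {"$first": "$_id"},
        }},
    ]


def best_per_hash(collection):
    """Run the pipeline on a PyMongo collection (any object with .aggregate)."""
    return list(collection.aggregate(best_per_hash_pipeline()))
```

With a running server this could be called as best_per_hash(MongoClient()["test"]["col"]), where the database and collection names are placeholders. For the sample documents, the group for hash "1234567890abcedf" should resolve to "Bar", which ties with "Baz" at num_sold 55 but has the later created value.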
