Find the count of maximum consecutive records based on one field in a MongoDB query
Question
I want to find the maximum number of consecutive records based on one particular field.
My db.people collection, sorted by the updated_at field, is:
> db.people.find().sort({ updated_at: 1})
{ "_id" : 1, "name" : "aaa", "flag" : true, "updated_at" : ISODate("2014-02-07T08:42:48.688Z") }
{ "_id" : 2, "name" : "bbb", "flag" : false, "updated_at" : ISODate("2014-02-07T08:43:10Z") }
{ "_id" : 3, "name" : "ccc", "flag" : true, "updated_at" : ISODate("2014-02-07T08:43:40.660Z") }
{ "_id" : 4, "name" : "ddd", "flag" : true, "updated_at" : ISODate("2014-02-07T08:43:51.567Z") }
{ "_id" : 6, "name" : "fff", "flag" : false, "updated_at" : ISODate("2014-02-07T08:44:23.713Z") }
{ "_id" : 7, "name" : "ggg", "flag" : true, "updated_at" : ISODate("2014-02-07T08:44:44.639Z") }
{ "_id" : 8, "name" : "hhh", "flag" : true, "updated_at" : ISODate("2014-02-07T08:44:51.415Z") }
{ "_id" : 5, "name" : "eee", "flag" : true, "updated_at" : ISODate("2014-02-07T08:55:24.917Z") }
In the records above, there are two places where the flag attribute is true in consecutive records, i.e.
record with _id 3 - record with _id 4 (2 consecutive records)
and
record with _id 7 - record with _id 8 - record with _id 5 (3 consecutive records)
However, I want the maximum consecutive count from the mongo query, i.e. 3.
Is it possible to get such a result?
I googled it and found a somewhat similar solution using Map-Reduce here: https://stackoverflow.com/a/7408639/1120530.
I am new to MongoDB and could not understand the map-reduce documentation, especially how to apply it to the scenario above.
Recommended answer
You can do this with a mapReduce operation.
First, the mapper:
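var mapper = function () {
    // Extend the current run when flag is true; reset it when flag is false.
    if ( this.flag == true ) {
        totalCount++;
    } else {
        totalCount = 0;
    }
    // Emit every document in a run under the run's key;
    // a false flag closes the run and advances the key.
    if ( totalCount != 0 ) {
        emit (
            counter,
            { _id: this._id, totalCount: totalCount }
        );
    } else {
        counter++;
    }
};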
This keeps a running count of the consecutive documents in which flag is true. Whenever that count is non-zero, we emit a value that also contains the document _id. A second counter, used as the key, is incremented whenever flag is false, so that each run of matches gets its own grouping "key".
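To see what the mapper emits without a MongoDB server, the same logic can be traced in plain JavaScript. This is only a sketch: the docs array is the sample data from the question (reduced to _id and flag, already in updated_at order), and the emit function and emitted array are hypothetical stand-ins for what mapReduce provides.

```javascript
// Sample documents from the question, already sorted by updated_at.
const docs = [
  { _id: 1, flag: true },
  { _id: 2, flag: false },
  { _id: 3, flag: true },
  { _id: 4, flag: true },
  { _id: 6, flag: false },
  { _id: 7, flag: true },
  { _id: 8, flag: true },
  { _id: 5, flag: true },
];

// Stand-ins (assumptions) for the mapReduce "scope" variables and for emit().
let totalCount = 0;
let counter = 0;
const emitted = [];
const emit = (key, value) => emitted.push({ key, value });

// Same logic as the mapper, with `doc` in place of `this`.
for (const doc of docs) {
  if (doc.flag === true) {
    totalCount++;
  } else {
    totalCount = 0;
  }
  if (totalCount !== 0) {
    emit(counter, { _id: doc._id, totalCount });
  } else {
    counter++;
  }
}

// Print each emitted key/value pair.
for (const e of emitted) {
  console.log(JSON.stringify(e));
}
```

With these eight documents the loop emits six pairs: one under key 0 (the single record with _id 1), two under key 1 (_id 3 and 4) and three under key 2 (_id 7, 8 and 5).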
Then the reducer:
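var reducer = function ( key, values ) {
    var result = { docs: [] };
    // Collect the _id of every document in the run;
    // totalCount ends up as the count carried by the last value.
    values.forEach(function(value) {
        result.docs.push(value._id);
        result.totalCount = value.totalCount;
    });
    return result;
};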
This simply pushes the _id values onto a result array, along with the totalCount.
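As a quick check of that behaviour, the reducer can be called directly on a hand-written list of values. The sketch below assumes the values arrive in emit order and uses the three values the mapper would produce for the run covering _id 7, 8 and 5; the key 2 is just the grouping key for that run.

```javascript
// Reducer as given in the answer.
const reducer = function (key, values) {
  const result = { docs: [] };
  values.forEach(function (value) {
    result.docs.push(value._id);
    result.totalCount = value.totalCount;
  });
  return result;
};

// Values emitted for one run of consecutive true flags (assumed emit order).
const values = [
  { _id: 7, totalCount: 1 },
  { _id: 8, totalCount: 2 },
  { _id: 5, totalCount: 3 },
];

const reduced = reducer(2, values);
console.log(JSON.stringify(reduced)); // {"docs":[7,8,5],"totalCount":3}
```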
Then run:
db.people.mapReduce(
    mapper,
    reducer,
    {
        "out": { "inline": 1 },
        "scope": {
            "totalCount": 0,
            "counter": 0
        },
        "sort": { "updated_at": 1 }
    }
)
So with the mapper and reducer functions in place, we define the global variables they use in "scope" and pass in the "sort" that was required on the updated_at dates. This gives the result:
{
    "results" : [
        {
            "_id" : 1,
            "value" : {
                "docs" : [
                    3,
                    4
                ],
                "totalCount" : 2
            }
        },
        {
            "_id" : 2,
            "value" : {
                "docs" : [
                    7,
                    8,
                    5
                ],
                "totalCount" : 3
            }
        }
    ],
    "timeMillis" : 2,
    "counts" : {
        "input" : 7,
        "emit" : 5,
        "reduce" : 2,
        "output" : 2
    },
    "ok" : 1,
}
Of course, you could skip the totalCount variable and just use the array length, which would be the same. But since you want that counter anyway, it is included here. That is the principle.
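The mapReduce output lists every run, while the question asks for a single number. One way to get it, sketched here in plain JavaScript, is to take the largest totalCount over the results array. The results constant below is copied from the output above; in the shell it would presumably come from the results field of the inline mapReduce return value. The sketch reads totalCount rather than docs.length because MongoDB does not call the reduce function for a key with only one emitted value, so such an entry would have no docs array.

```javascript
// The "results" array from the mapReduce output shown in the answer.
const results = [
  { _id: 1, value: { docs: [3, 4], totalCount: 2 } },
  { _id: 2, value: { docs: [7, 8, 5], totalCount: 3 } },
];

// Longest run = largest totalCount across all groups (0 if there are none).
const maxConsecutive = results.reduce(
  (max, r) => Math.max(max, r.value.totalCount),
  0
);

console.log(maxConsecutive); // 3
```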
So yes, this was a problem suited to mapReduce, and now you have an example.