Find the count of maximum consecutive records based on one field in a MongoDB query

Question

I want to find the maximum number of consecutive records based on one particular field.

My db.people collection, sorted by the updated_at field, looks like this:

> db.people.find().sort({ updated_at: 1})
{ "_id" : 1, "name" : "aaa", "flag" : true, "updated_at" : ISODate("2014-02-07T08:42:48.688Z") }
{ "_id" : 2, "name" : "bbb", "flag" : false, "updated_at" : ISODate("2014-02-07T08:43:10Z") }
{ "_id" : 3, "name" : "ccc", "flag" : true, "updated_at" : ISODate("2014-02-07T08:43:40.660Z") }
{ "_id" : 4, "name" : "ddd", "flag" : true, "updated_at" : ISODate("2014-02-07T08:43:51.567Z") }
{ "_id" : 6, "name" : "fff", "flag" : false, "updated_at" : ISODate("2014-02-07T08:44:23.713Z") }
{ "_id" : 7, "name" : "ggg", "flag" : true, "updated_at" : ISODate("2014-02-07T08:44:44.639Z") }
{ "_id" : 8, "name" : "hhh", "flag" : true, "updated_at" : ISODate("2014-02-07T08:44:51.415Z") }
{ "_id" : 5, "name" : "eee", "flag" : true, "updated_at" : ISODate("2014-02-07T08:55:24.917Z") }

In the records above, there are two places where the flag attribute is true in consecutive records, i.e.:

record with _id 3 - record with _id 4   (2 consecutive records)

record with _id 7 - record with _id 8 - record with _id 5  (3 consecutive records)

However, I want the maximum consecutive count from the mongo query, i.e. 3.

Is it possible to get such a result?

I googled it and found a somewhat similar solution using Map-Reduce here: https://stackoverflow.com/a/7408639/1120530.

I am new to MongoDB and could not understand the map-reduce documentation, especially how to apply it to the scenario above.

Recommended answer

You can do this with a mapReduce operation.

First, the mapper:

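var mapper = function () {

    // totalCount and counter are globals supplied through "scope"
    if ( this.flag == true ) {
        totalCount++;      // extend the current run of true flags
    } else {
        totalCount = 0;    // a false flag ends the run
    }

    if ( totalCount != 0 ) {
        // Inside a run: emit under the current group key
        emit(
            counter,
            { _id: this._id, totalCount: totalCount }
        );
    } else {
        counter++;         // move to a new group key for the next run
    }

};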

This keeps a running count of how many consecutive times the value true has been seen in flag. Whenever that count is non-zero, we emit the value, which also contains the document _id. A second counter, used as the key, is incremented each time flag is false, so that each run of matches gets its own grouping "key".
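The mapper's bookkeeping can be traced outside the database. The following is a minimal sketch in plain JavaScript, assuming the eight documents from the question in updated_at order; the `emit` stand-in, the `emitted` array and the `mapDoc` wrapper are hypothetical helpers for this simulation and are not part of MongoDB.

```javascript
// Sample documents from the question, already in updated_at order.
const people = [
    { _id: 1, flag: true },
    { _id: 2, flag: false },
    { _id: 3, flag: true },
    { _id: 4, flag: true },
    { _id: 6, flag: false },
    { _id: 7, flag: true },
    { _id: 8, flag: true },
    { _id: 5, flag: true }
];

// Stand-ins for the "scope" globals and for MongoDB's emit()
// (assumptions made for this simulation only).
let totalCount = 0;
let counter = 0;
const emitted = [];
function emit(key, value) {
    emitted.push({ key: key, value: value });
}

// Same two-step logic as the mapper, with the document passed in explicitly.
function mapDoc(doc) {
    if (doc.flag == true) {
        totalCount++;
    } else {
        totalCount = 0;
    }

    if (totalCount != 0) {
        emit(counter, { _id: doc._id, totalCount: totalCount });
    } else {
        counter++;
    }
}

people.forEach(mapDoc);

// Print each emitted key with its value.
emitted.forEach(function (e) {
    console.log(e.key, JSON.stringify(e.value));
});
```

With all eight listed documents, the document with _id 1 forms a run of its own under key 0, the pair 3 and 4 shares key 1, and 7, 8 and 5 share key 2.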

Then the reducer:

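var reducer = function ( key, values ) {

    // Collect the _id of every document in this run and keep the largest count
    var result = { docs: [], totalCount: 0 };

    values.forEach(function(value) {
        // A value is either a raw emit ({ _id }) or an earlier reduce result ({ docs })
        result.docs = result.docs.concat(value.docs || [value._id]);
        result.totalCount = Math.max(result.totalCount, value.totalCount);
    });

    return result;

};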

This simply collects the _id values into a result array along with the totalCount.
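MongoDB may call a reduce function more than once for the same key, feeding earlier results back in as input values, so a reducer has to accept its own output. The sketch below checks that property in plain JavaScript. `reduceRun` is a hypothetical standalone function written for this check; it accepts either an emitted value or an earlier result, which is an assumption of this sketch.

```javascript
// Hypothetical reducer for the check: tolerates both raw emits ({ _id, totalCount })
// and earlier reduce results ({ docs, totalCount }).
function reduceRun(key, values) {
    const result = { docs: [], totalCount: 0 };
    values.forEach(function (value) {
        // Raw emits carry _id; earlier reduce results carry docs.
        result.docs = result.docs.concat(value.docs || [value._id]);
        result.totalCount = Math.max(result.totalCount, value.totalCount);
    });
    return result;
}

// Reducing all values for key 2 in one call...
const all = reduceRun(2, [
    { _id: 7, totalCount: 1 },
    { _id: 8, totalCount: 2 },
    { _id: 5, totalCount: 3 }
]);

// ...and reducing in two steps, feeding the partial result back in.
const partial = reduceRun(2, [
    { _id: 7, totalCount: 1 },
    { _id: 8, totalCount: 2 }
]);
const twoStep = reduceRun(2, [partial, { _id: 5, totalCount: 3 }]);

console.log(JSON.stringify(all));
console.log(JSON.stringify(twoStep));
```

Both calls should produce the same object, which is what makes the function safe to apply repeatedly.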

Then run:

db.people.mapReduce(
    mapper,
    reducer,
    {
        "out": { "inline": 1 },
        "scope": {
            "totalCount": 0,
            "counter": 0
        },
        "sort": { "updated_at": 1 }
    }
)

So with the mapper and reducer functions in place, we define the global variables they use in "scope" and pass in the required "sort" on updated_at. This gives the result below. (Its counts report 7 input documents; run against all eight documents listed in the question, the document with _id 1 would additionally be emitted under key 0.)

{
    "results" : [
        {
            "_id" : 1,
            "value" : {
                "docs" : [
                    3,
                    4
                ],
                "totalCount" : 2
            }
        },
        {
            "_id" : 2,
            "value" : {
                "docs" : [
                    7,
                    8,
                    5
                ],
                "totalCount" : 3
            }
        }
    ],
    "timeMillis" : 2,
    "counts" : {
        "input" : 7,
        "emit" : 5,
        "reduce" : 2,
        "output" : 2
    },
    "ok" : 1
}

Of course, you could skip the totalCount variable and just use the array length, which would be the same. But since you wanted that counter anyway, it is included. That is the principle.
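The question asks for a single number, so one step remains: taking the largest totalCount across the groups. A minimal sketch, assuming the inline output's "results" array has been copied into a variable; `maxConsecutive` is a hypothetical helper name, not a MongoDB function.

```javascript
// The "results" array from the inline mapReduce output shown above.
const results = [
    { _id: 1, value: { docs: [3, 4], totalCount: 2 } },
    { _id: 2, value: { docs: [7, 8, 5], totalCount: 3 } }
];

// Return the largest totalCount over all groups, or 0 when there are none.
function maxConsecutive(groups) {
    return groups.reduce(function (max, group) {
        return Math.max(max, group.value.totalCount);
    }, 0);
}

console.log(maxConsecutive(results)); // 3
```

Reading totalCount rather than the docs array length keeps this working for a group that holds a single document, where the stored value is the emitted object itself and has no docs array.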

So yes, this is a problem suited to mapReduce, and now you have an example.
