Find count of maximum consecutive records based on one field in Mongodb Query


Problem description


I want to find the count of maximum consecutive records based on one particular field.

My db.people collection, sorted by the updated_at field, looks like this:

> db.people.find().sort({ updated_at: 1})
{ "_id" : 1, "name" : "aaa", "flag" : true, "updated_at" : ISODate("2014-02-07T08:42:48.688Z") }
{ "_id" : 2, "name" : "bbb", "flag" : false, "updated_at" : ISODate("2014-02-07T08:43:10Z") }
{ "_id" : 3, "name" : "ccc", "flag" : true, "updated_at" : ISODate("2014-02-07T08:43:40.660Z") }
{ "_id" : 4, "name" : "ddd", "flag" : true, "updated_at" : ISODate("2014-02-07T08:43:51.567Z") }
{ "_id" : 6, "name" : "fff", "flag" : false, "updated_at" : ISODate("2014-02-07T08:44:23.713Z") }
{ "_id" : 7, "name" : "ggg", "flag" : true, "updated_at" : ISODate("2014-02-07T08:44:44.639Z") }
{ "_id" : 8, "name" : "hhh", "flag" : true, "updated_at" : ISODate("2014-02-07T08:44:51.415Z") }
{ "_id" : 5, "name" : "eee", "flag" : true, "updated_at" : ISODate("2014-02-07T08:55:24.917Z") }

In the above records, there are two places where the flag attribute is true in consecutive records, i.e.

record with _id 3 - record with _id 4   (2 consecutive records)

and

record with _id 7 - record with _id 8 - record with _id 5  (3 consecutive records)

However, I want the query to return only the maximum number of consecutive records, i.e. 3.

Is it possible to get such a result?

I googled it and found a somewhat similar solution using Map-Reduce here: https://stackoverflow.com/a/7408639/1120530.

I am new to mongodb and wasn't able to understand the map-reduce documentation, especially how to apply it to the above scenario.

Solution

You can do this with a mapReduce operation.

First the mapper:

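var mapper = function () {

    // Extend the current run of true flags, or reset it on false.
    if ( this.flag == true ) {
        totalCount++;
    } else {
        totalCount = 0;
    }

    // Inside a run: emit under the current grouping key.
    // Outside a run: advance the key so the next run forms a new group.
    if ( totalCount != 0 ) {
        emit(
            counter,
            { _id: this._id, totalCount: totalCount }
        );
    } else {
        counter++;
    }

};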

This keeps a running count of consecutive documents in which flag is true. Whenever that count is non-zero, we emit the value, which also contains the document _id. A second counter, used for the key, is incremented whenever flag is false, so that each run of matches gets its own grouping "key".
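To see what the mapper emits without running MongoDB, the same logic can be traced in plain JavaScript. This is only a sketch: simulateMapper is a hypothetical helper (not part of MongoDB), the two scope variables are modeled as local variables, and the documents are assumed to arrive already sorted by updated_at.

```javascript
// Sample documents from the question, already in updated_at order.
// Only the fields the mapper reads are kept.
const docs = [
  { _id: 1, flag: true },
  { _id: 2, flag: false },
  { _id: 3, flag: true },
  { _id: 4, flag: true },
  { _id: 6, flag: false },
  { _id: 7, flag: true },
  { _id: 8, flag: true },
  { _id: 5, flag: true },
];

// Hypothetical helper: returns the (key, value) pairs the mapper
// would emit if it saw the documents in exactly this order.
function simulateMapper(sortedDocs) {
  let totalCount = 0; // length of the current run of flag === true
  let counter = 0;    // grouping key, advanced on every flag === false
  const emitted = [];

  for (const doc of sortedDocs) {
    totalCount = doc.flag === true ? totalCount + 1 : 0;

    if (totalCount !== 0) {
      emitted.push({ key: counter, value: { _id: doc._id, totalCount } });
    } else {
      counter++;
    }
  }
  return emitted;
}

const emitted = simulateMapper(docs);
console.log(emitted);
```

With the eight documents exactly as listed, the trace also emits one pair under key 0 for _id 1, since that document starts a run of length 1 before the first false flag.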

Then the reducer:

var reducer = function ( key, values ) {

    var result = { docs: [] };

    values.forEach(function(value) {
        result.docs.push(value._id);
        result.totalCount = value.totalCount;
    });

    return result;

};

This simply pushes the _id values onto a result array and carries the totalCount along.
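The grouping step that sits between the mapper and the reducer can also be sketched in plain JavaScript. The groupAndReduce helper and the hard-coded emitted pairs below are assumptions for illustration; the reducer is the one shown above. MongoDB does not call reduce for a key that has only one value, so the sketch mimics that by passing such a value through unchanged.

```javascript
// Reducer from the answer: collects _id values and keeps the last totalCount seen.
const reducer = function (key, values) {
  const result = { docs: [] };
  values.forEach(function (value) {
    result.docs.push(value._id);
    result.totalCount = value.totalCount;
  });
  return result;
};

// Illustrative emitted pairs for two runs (keys 1 and 2).
const emittedPairs = [
  { key: 1, value: { _id: 3, totalCount: 1 } },
  { key: 1, value: { _id: 4, totalCount: 2 } },
  { key: 2, value: { _id: 7, totalCount: 1 } },
  { key: 2, value: { _id: 8, totalCount: 2 } },
  { key: 2, value: { _id: 5, totalCount: 3 } },
];

// Hypothetical helper: groups values by key, then applies the reducer
// to every key that has more than one value.
function groupAndReduce(pairs, reduceFn) {
  const groups = new Map();
  for (const { key, value } of pairs) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }

  const results = [];
  for (const [key, values] of groups) {
    const value = values.length > 1 ? reduceFn(key, values) : values[0];
    results.push({ _id: key, value });
  }
  return results;
}

const reduced = groupAndReduce(emittedPairs, reducer);
console.log(JSON.stringify(reduced));
```

The sketch hands values to the reducer in emit order, which is what makes the last totalCount the run length here.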

Then run:

db.people.mapReduce(
    mapper,
    reducer,
    {
        "out": { "inline": 1 },
        "scope": {
            "totalCount": 0,
            "counter": 0
        },
        "sort": { "updated_at": 1 }
    }
)

So along with the mapper and reducer functions, we define the global variables in "scope" and pass in the "sort" that was required on the updated_at dates. This gives the result:

{
    "results" : [
        {
            "_id" : 1,
            "value" : {
                "docs" : [
                    3,
                    4
                ],
                "totalCount" : 2
            }
        },
        {
            "_id" : 2,
            "value" : {
                "docs" : [
                    7,
                    8,
                    5
                ],
                "totalCount" : 3
            }
        }
    ],
    "timeMillis" : 2,
    "counts" : {
        "input" : 7,
        "emit" : 5,
        "reduce" : 2,
        "output" : 2
    },
    "ok" : 1
}

Of course, you could skip the totalCount variable and just use the array length, which would give the same number. But since you wanted that counter anyway, it is included. Either way, that's the principle.
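The result above lists every run, while the question asks for the single largest count. One way to get it is to scan the inline results on the client. The sketch below assumes the result shape shown above; maxConsecutive is a hypothetical helper, and output is a hand-written stand-in for the object returned by db.people.mapReduce(...).

```javascript
// Stand-in for the inline mapReduce output shown above (only "results" is needed).
const output = {
  results: [
    { _id: 1, value: { docs: [3, 4], totalCount: 2 } },
    { _id: 2, value: { docs: [7, 8, 5], totalCount: 3 } },
  ],
};

// Hypothetical helper: returns the length of the longest run, or 0 if there are none.
function maxConsecutive(results) {
  return results.reduce(function (max, entry) {
    const value = entry.value;
    // A key with a single emitted value never passes through reduce,
    // so it has no "docs" array; fall back to its totalCount.
    const runLength = Array.isArray(value.docs) ? value.docs.length : value.totalCount;
    return Math.max(max, runLength);
  }, 0);
}

console.log(maxConsecutive(output.results)); // 3
```

Using the docs array length where it exists follows the remark above that the array length and totalCount carry the same information.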

So yes, this was a problem suited to mapReduce, and now you have an example.
