MongoDB Aggregation by Day then Hour


Question

I am using MongoDB aggregation to aggregate a set of data. My situation is a bit complex. I have a collection like the following:

{
  startTime: ISODate("2014-12-31T10:20:30Z"),
  customerId: 123,
  ping: "2",
  link: "3"
}

Now I want to aggregate the data into another collection, like the following:

{
  _id: {
    day: ISODate("2014-12-31T00:00:00Z"),
    customerId: 123
  },
  hours: [
    {
      hour: ISODate("2014-12-31T10:00:00Z"),
      pings: 2,
      links: 3
    },
    {
      hour: ISODate("2014-12-31T11:00:00Z"),
      pings: 5,
      links: 6
    }
  ]
}

As you can see, the data is grouped by day first and then by hour. I have the following aggregation query to group the documents by day, but how do I group them further by hour? Any ideas?

var pipeline = [
    {
        // Extract the time-of-day components of startTime
        $project: {
            startTime: 1,
            customerId: 1,
            ping: 1,
            link: 1,
            date: "$startTime",
            h: { "$hour": "$startTime" },
            m: { "$minute": "$startTime" },
            s: { "$second": "$startTime" },
            ml: { "$millisecond": "$startTime" }
        }
    },
    {
        // Subtract the time-of-day components to truncate "date" to midnight
        $project: {
            startTime: 1,
            customerId: 1,
            ping: 1,
            link: 1,
            date: {
                "$subtract": [
                    "$date",
                    {
                        "$add": [
                            "$ml",
                            { "$multiply": [ "$s", 1000 ] },
                            { "$multiply": [ "$m", 60, 1000 ] },
                            { "$multiply": [ "$h", 60, 60, 1000 ] }
                        ]
                    }
                ]
            }
        }
    },
    {
        $match: {
            "startTime": {
                $gte: new ISODate("2013-12-01T07:00:00Z"),
                $lte: new ISODate("2014-01-01T08:00:00Z")
            }
        }
    },
    // Aggregate the data
    {
        $group: {
            _id: { day: "$date", customerId: "$customerId" },
            pings: { $sum: "$ping" },
            links: { $sum: "$link" }
        }
    }
];

Recommended answer

What you basically want is a double grouping. Note that the date aggregation operators do not give you the entire date object back, just the relevant parts:

db.collection.aggregate([
    { "$group": {
        "_id": {
            "customerId": "$customerId",
            "day": { "$dayOfYear": "$startTime" },
            "hour": { "$hour": "$startTime" }
        },
        "pings": { "$sum": "$ping" },
        "links": { "$sum": "$link" }
    }},
    { "$group": {
       "_id": {
           "customerId": "$_id.customerId",
           "day": "$_id.day"
       },
       "hours": { 
           "$push": { 
               "hour": "$_id.hour",
               "pings": "$pings",
               "links": "$links"
           }
       }
    }}
])

The double $group gives you the format you want by placing the hourly results into an array per day. There is only a single document in the sample, but you basically get results like this:

{
    "_id" : {
            "customerId" : 123,
            "day" : 365
    },
    "hours" : [
            {
                    "hour" : 10,
                    "pings" : 2,
                    "links" : 3
            }
    ]
}
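
To make the two stages concrete, here is a minimal plain JavaScript sketch that emulates them in memory. It is not MongoDB code: the function names dayOfYearUTC and groupByDayThenHour are made up for illustration, and the sketch assumes "ping" and "link" are already numbers.

```javascript
// Day of the year (1-366) in UTC, analogous to what $dayOfYear returns.
function dayOfYearUTC(date) {
  const startOfYear = Date.UTC(date.getUTCFullYear(), 0, 1);
  return Math.floor((date.getTime() - startOfYear) / 86400000) + 1;
}

// Hypothetical in-memory emulation of the two $group stages.
function groupByDayThenHour(docs) {
  // Stage 1: sum pings and links per (customerId, day, hour).
  const perHour = new Map();
  for (const doc of docs) {
    const day = dayOfYearUTC(doc.startTime);
    const hour = doc.startTime.getUTCHours();
    const key = [doc.customerId, day, hour].join("|");
    if (!perHour.has(key)) {
      perHour.set(key, { customerId: doc.customerId, day: day, hour: hour, pings: 0, links: 0 });
    }
    const bucket = perHour.get(key);
    bucket.pings += doc.ping;
    bucket.links += doc.link;
  }

  // Stage 2: push each hourly bucket into an array per (customerId, day).
  const perDay = new Map();
  for (const b of perHour.values()) {
    const key = [b.customerId, b.day].join("|");
    if (!perDay.has(key)) {
      perDay.set(key, { _id: { customerId: b.customerId, day: b.day }, hours: [] });
    }
    perDay.get(key).hours.push({ hour: b.hour, pings: b.pings, links: b.links });
  }
  return Array.from(perDay.values());
}
```

Calling groupByDayThenHour with several documents for the same customer and day returns one entry per day whose "hours" array holds one summed bucket per hour, which is the shape the pipeline above produces.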

If you find the results of the date operators too difficult to deal with, or want a simplified "pass-through" result for date objects, then you could cast to epoch timestamps instead:

db.collection.aggregate([
    { "$group": {
        "_id": {
            "customerId": "$customerId",
            "day": {
               "$subtract": [
                   { "$subtract": [ "$startTime", new Date("1970-01-01") ] },
                   {
                       "$mod": [
                           { "$subtract": [ "$startTime", new Date("1970-01-01") ] },
                           1000*60*60*24   
                       ]
                   }
               ]
            },
            "hour": {
               "$subtract": [
                   { "$subtract": [ "$startTime", new Date("1970-01-01") ] },
                   {
                       "$mod": [
                           { "$subtract": [ "$startTime", new Date("1970-01-01") ] },
                           1000*60*60   
                       ]
                   }
               ]
            }
        },
        "pings": { "$sum": "$ping" },
        "links": { "$sum": "$link" }
    }},
    { "$group": {
       "_id": {
           "customerId": "$_id.customerId",
           "day": "$_id.day"
       },
       "hours": { 
           "$push": { 
               "hour": "$_id.hour",
               "pings": "$pings",
               "links": "$links"
           }
       }
    }}
])

The trick here is that when you $subtract one date object from another, you get the difference in milliseconds back as a number. Subtracting the "epoch" start date (1970-01-01) therefore yields the whole timestamp value, and the "date math" with $mod then rounds it down to the required interval. So the result:

{
    "_id" : {
            "customerId" : 123,
            "day" : NumberLong("1419984000000")
    },
    "hours" : [
            {
                    "hour" : NumberLong("1420020000000"),
                    "pings" : 2,
                    "links" : 3
            }
    ]
}

Depending on your needs, this might be more palatable than what the date operators provide.
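
The rounding itself is ordinary modulo arithmetic, which can be checked outside MongoDB. A minimal sketch in plain JavaScript follows. The helper name truncateToInterval and the constant names are made up for illustration, and the sketch assumes timestamps at or after 1970-01-01 with intervals aligned to UTC.

```javascript
const MS_PER_HOUR = 1000 * 60 * 60;
const MS_PER_DAY = MS_PER_HOUR * 24;

// Round an epoch-millisecond value down to the start of its interval,
// mirroring { $subtract: [ ts, { $mod: [ ts, interval ] } ] }.
function truncateToInterval(epochMs, intervalMs) {
  return epochMs - (epochMs % intervalMs);
}

const sampleTs = Date.UTC(2014, 11, 31, 10, 20, 30); // 2014-12-31T10:20:30Z
const dayStart = truncateToInterval(sampleTs, MS_PER_DAY);
const hourStart = truncateToInterval(sampleTs, MS_PER_HOUR);

console.log(dayStart, new Date(dayStart).toISOString());
// 1419984000000 2014-12-31T00:00:00.000Z
console.log(hourStart, new Date(hourStart).toISOString());
// 1420020000000 2014-12-31T10:00:00.000Z
```

The two numbers match the "day" and "hour" values in the result document above.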

With MongoDB 2.6 you can also add a little shorthand for this via the $let operator, which allows you to declare "variables" for scoped operations:

db.event.aggregate([
    { "$group": {
        "_id": {
            "$let": {
                "vars": { 
                   "date": { "$subtract": [ "$startTime", new Date("1970-01-01") ] },
                   "day": 1000*60*60*24,
                   "hour": 1000*60*60
                },
                "in": {
                    "customerId": "$customerId",
                    "day": {
                        "$subtract": [
                            "$$date",
                            { "$mod": [ "$$date", "$$day" ] }
                         ]
                    },
                    "hour": {
                        "$subtract": [
                            "$$date",
                            { "$mod": [ "$$date", "$$hour" ] }
                         ]
                    }
                }
            }
        },
        "pings": { "$sum": "$ping" },
        "links": { "$sum": "$link" }
    }},
    { "$group": {
       "_id": {
           "customerId": "$_id.customerId",
           "day": "$_id.day"
       },
       "hours": { 
           "$push": { 
               "hour": "$_id.hour",
               "pings": "$pings",
               "links": "$links"
           }
       }
    }}
])
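
The target documents in the question use dates rather than numbers for "day" and "hour". One possible way to get there is to convert the epoch values on the client after the aggregation runs. The sketch below is plain JavaScript; the helper name epochResultToDates is hypothetical, and it assumes the values arrive as plain numbers (a NumberLong would need converting to a number first).

```javascript
// Hypothetical helper: convert the numeric "day" and "hour" values of one
// result document back into Date objects, keeping the sums unchanged.
function epochResultToDates(result) {
  return {
    _id: {
      day: new Date(result._id.day),
      customerId: result._id.customerId
    },
    hours: result.hours.map(function (h) {
      return { hour: new Date(h.hour), pings: h.pings, links: h.links };
    })
  };
}

const converted = epochResultToDates({
  _id: { customerId: 123, day: 1419984000000 },
  hours: [ { hour: 1420020000000, pings: 2, links: 3 } ]
});
console.log(converted._id.day.toISOString());      // 2014-12-31T00:00:00.000Z
console.log(converted.hours[0].hour.toISOString()); // 2014-12-31T10:00:00.000Z
```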

Also, I nearly forgot to mention that your values for "ping" and "link" are actually strings, unless that is a typo. If it is not, make sure you convert them to numbers first, because $sum only adds numeric values.
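
As a rough illustration of that conversion, here is a plain JavaScript sketch that parses the two fields of a single document. The helper name withNumericCounts is made up, and how the converted values are written back to the collection is left out, since that depends on your setup.

```javascript
// Hypothetical helper: return a copy of a document with "ping" and "link"
// parsed as base-10 integers.
function withNumericCounts(doc) {
  return Object.assign({}, doc, {
    ping: parseInt(doc.ping, 10),
    link: parseInt(doc.link, 10)
  });
}

const fixedDoc = withNumericCounts({ customerId: 123, ping: "2", link: "3" });
console.log(fixedDoc.ping + fixedDoc.link); // 5
```

Without the conversion, the same addition in JavaScript would concatenate the strings into "23", which shows why the stored type matters.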
