MongoDB Aggregation: Compute Running Totals from sum of previous rows


Question

Sample documents:

{ time: ISODate("2013-10-10T20:55:36Z"), value: 1 }
{ time: ISODate("2013-10-10T22:43:16Z"), value: 2 }
{ time: ISODate("2013-10-11T19:12:66Z"), value: 3 }
{ time: ISODate("2013-10-11T10:15:38Z"), value: 4 }
{ time: ISODate("2013-10-12T04:15:38Z"), value: 5 }

It's easy to get aggregated results grouped by date. What I want, though, is a query that also returns a running total of the aggregation, like:

{ time: "2013-10-10", total: 3, runningTotal: 3  }
{ time: "2013-10-11", total: 7, runningTotal: 10 }
{ time: "2013-10-12", total: 5, runningTotal: 15 }

Is this possible with MongoDB aggregation?

Recommended answer

This does what you need. I have normalised the times in the data so that they group together (you could do something like this). The idea is to $group and push the times and totals into separate arrays. Then $unwind the time array, which gives each time document its own copy of the totals array. You can then calculate the runningTotal (or something like a rolling average) from that array, since it contains the data for all the different times. The 'index' generated by $unwind is the array index of the total corresponding to that time. It is important to $sort before the arrays are built and unwound, since this ensures the arrays are in the correct order.
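
The normalisation step mentioned above is not shown in the answer. As one possible sketch (the helper name truncateToUtcDay is hypothetical, and bucketing days in UTC is an assumption), timestamps could be truncated to midnight before the documents are stored or grouped:

```javascript
// Hypothetical helper (not from the answer): truncate a Date to
// midnight UTC so that all timestamps from the same UTC day compare
// equal when used as a $group key.
function truncateToUtcDay(date) {
  return new Date(Date.UTC(
    date.getUTCFullYear(),
    date.getUTCMonth(),
    date.getUTCDate()
  ));
}

// Two timestamps from the question's sample data, same calendar day.
const a = truncateToUtcDay(new Date('2013-10-10T20:55:36Z'));
const b = truncateToUtcDay(new Date('2013-10-10T22:43:16Z'));

console.log(a.toISOString());             // 2013-10-10T00:00:00.000Z
console.log(a.getTime() === b.getTime()); // true
```

With the times normalised this way, grouping on '$time' puts every document from the same day into one group.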

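db.temp.aggregate(
    [
        // Sum the values for each (already normalised) time.
        {
            '$group': {
                '_id': '$time',
                'total': { '$sum': '$value' }
            }
        },
        // Sort chronologically so the arrays built next are in order.
        {
            '$sort': {
                 '_id': 1
            }
        },
        // Collapse everything into one document holding two parallel arrays.
        {
            '$group': {
                '_id': 0,
                'time': { '$push': '$_id' },
                'totals': { '$push': '$total' }
            }
        },
        // One document per time; each keeps the full totals array and its index.
        {
            '$unwind': {
                'path' : '$time',
                'includeArrayIndex' : 'index'
            }
        },
        // total: the element at index; runningTotal: sum of the first index + 1 elements.
        {
            '$project': {
                '_id': 0,
                'time': { '$dateToString': { 'format': '%Y-%m-%d', 'date': '$time' }  },
                'total': { '$arrayElemAt': [ '$totals', '$index' ] },
                'runningTotal': { '$sum': { '$slice': [ '$totals', { '$add': [ '$index', 1 ] } ] } },
            }
        },
    ]
);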
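
To see why the last two stages produce a running total, the same steps can be mimicked on plain arrays. This is only an illustration in ordinary JavaScript, not something that runs on the server; the per-day totals are taken from the expected output in the question, and the variable names are made up for the example.

```javascript
// Per-day totals as they would look after the first $group and $sort
// (values taken from the question's expected output).
const time = ['2013-10-10', '2013-10-11', '2013-10-12'];
const totals = [3, 7, 5];

// Mimics $unwind on `time` with includeArrayIndex: one document per
// day, each still carrying the complete `totals` array and its index.
const unwound = time.map((t, index) => ({ time: t, totals, index }));

// Mimics $project: pick this day's total and sum the first
// index + 1 entries of `totals`.
const result = unwound.map(({ time, totals, index }) => ({
  time,
  total: totals[index],                        // like $arrayElemAt
  runningTotal: totals
    .slice(0, index + 1)                       // like $slice
    .reduce((sum, v) => sum + v, 0),           // like $sum
}));

console.log(result);
```

The three resulting objects carry runningTotal values of 3, 10 and 15, matching the output requested in the question.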

I have used something similar on a collection with about 80,000 documents, aggregating to 63 results. I am not sure how well it will work on larger collections, but I have found that performing transformations (projections, array manipulations) on aggregated data does not seem to have a large performance cost once the data is reduced to a manageable size.
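
One reason to be cautious with many groups is that every unwound document carries a full copy of the totals array and sums a slice of it, so the work grows roughly quadratically with the number of groups. On MongoDB 5.0 or later, the $setWindowFields stage can express a running total directly. The sketch below is an alternative to the pipeline above rather than part of the original answer; it reuses the answer's temp collection and assumes time has already been normalised to the day.

```javascript
// Alternative sketch for MongoDB 5.0+ (not from the original answer).
// Assumes `time` is already normalised to the day, as in the answer.
const pipeline = [
  // Per-day totals, as in the answer's first stage.
  { $group: { _id: '$time', total: { $sum: '$value' } } },
  { $sort: { _id: 1 } },
  // Cumulative sum over every document from the first up to the current one.
  {
    $setWindowFields: {
      sortBy: { _id: 1 },
      output: {
        runningTotal: {
          $sum: '$total',
          window: { documents: ['unbounded', 'current'] },
        },
      },
    },
  },
  // Shape the output like the example in the question.
  {
    $project: {
      _id: 0,
      time: { $dateToString: { format: '%Y-%m-%d', date: '$_id' } },
      total: 1,
      runningTotal: 1,
    },
  },
];

// In the mongo shell this would be run as: db.temp.aggregate(pipeline)
```

This avoids building and copying the arrays, at the cost of requiring a newer server version.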
