Fastest way to get histogram of array sizes using MongoDB aggregation framework

Question

I'm trying to get a list of the number of records that have arrays of varying size. I want to get the distribution of array sizes for all records so I can build a histogram like this:

          | *
          | *
documents | *         *
          | *  *      *
          |_*__*__*___*__*___
            2  5  6  23  47

               Array Size

So the raw documents look something like this:

{hubs : [{stuff:0, id:6}, {stuff:1}, .... ]}
{hubs : [{stuff:0, id:6}]}

So far, using the aggregation framework and some of the help here, I've come up with:

db.sitedata.aggregate([{ $unwind:'$hubs'}, 
                       { $group : {_id:'$_id', count:{$sum:1}}}, 
                       { $group : {_id:'$count', count:{$sum:1}}},
                       { $sort  : {_id: 1}}])

This seems to give me the results I want, but it's not very fast. I'm wondering if there is something I can do like this that may not need two group calls. The syntax is wrong here, but what I'm trying to do is put the count value in the first _id field:

db.sitedata.aggregate([{ $unwind:'$hubs'}, 
                       { $group : {_id:{$count:$hubs}, count:1}},
                       { $sort  : { _id: 1 }}])

Recommended answer

Now that MongoDB 2.6 is out, the aggregation framework supports a new array operator, $size, which allows you to $project the array size without having to unwind and re-group.

db.sitedata.aggregate([{ $project:{ 'count': { '$size':'$hubs'} } }, 
                       { $group : {_id:'$count', count:{$sum:1} } },
                       { $sort  : { _id: 1 } } ] )
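
To make the shape of the result concrete, here is a minimal plain-JavaScript sketch of what the three stages above compute, run in memory rather than against a database. The sampleDocs data and the arraySizeHistogram helper are made up for illustration and are not part of MongoDB or of the original question.

```javascript
// Illustrative data only (hypothetical); mirrors the document shape in the question.
const sampleDocs = [
  { _id: 1, hubs: [{ stuff: 0, id: 6 }, { stuff: 1 }] },
  { _id: 2, hubs: [{ stuff: 0, id: 6 }] },
  { _id: 3, hubs: [{ stuff: 2 }, { stuff: 3 }] },
];

// Hypothetical helper: counts how many documents have each array length.
function arraySizeHistogram(docs, field) {
  const counts = new Map();
  for (const doc of docs) {
    // Plays the role of { $project: { count: { $size: '$hubs' } } }
    const size = doc[field].length;
    // Plays the role of { $group: { _id: '$count', count: { $sum: 1 } } }
    counts.set(size, (counts.get(size) || 0) + 1);
  }
  // Plays the role of { $sort: { _id: 1 } }
  return [...counts.entries()]
    .map(([size, count]) => ({ _id: size, count }))
    .sort((a, b) => a._id - b._id);
}

const histogram = arraySizeHistogram(sampleDocs, 'hubs');
console.log(histogram);
```

Each output entry has the array size in _id and the number of documents with that size in count, which is the data needed to draw the histogram. Note that $size expects an array, so documents where hubs is missing or not an array would need to be filtered out or given a default before the $project stage.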
