Binning and tabulate (unique/count) in Mongo

Question

I am looking for a way to generate some summary statistics using Mongo. Suppose I have a collection with many records of the form

{"name" : "Jeroen", "gender" : "m", "age" :27.53 }

Now I want to get the distributions for gender and age. Assume for gender, there are only values "m" and "f". What is the most efficient way of getting the total count of males and females in my collection?

And for age, is there a way to do some 'binning' that gives me a histogram-like summary, i.e. the number of records where age falls in the intervals [0, 2), [2, 4), [4, 6), etc.?

Recommended answer

Konstantin's answer was right. MapReduce gets the job done. Here is the full solution in case others find this interesting.

To count genders, the map function emits each record's gender attribute (this.gender) as the key. The reduce function then simply adds up the counts:

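// count genders
db.persons.mapReduce(
    // map: emit one count per record, keyed by gender
    function() {
        emit(this.gender, {count: 1});
    },
    // reduce: sum the counts emitted for each key
    function(key, values) {
        var result = {count: 0};
        values.forEach(function(value) {
            result.count += value.count;
        });
        return result;
    },
    // return the results inline instead of writing them to a collection
    {out: {inline: 1}}
);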
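
For readers who want to check the counting logic without a MongoDB server, the following is a minimal plain-JavaScript sketch of what the map and reduce steps compute together. It is not MongoDB code: the `countBy` helper and the sample records (other than the one from the question) are hypothetical and exist only for illustration.

```javascript
// Hypothetical sample records, shaped like the documents in the question.
const persons = [
  { name: "Jeroen", gender: "m", age: 27.53 },
  { name: "Anna", gender: "f", age: 3.2 },
  { name: "Bob", gender: "m", age: 10.9 },
];

// Illustrative stand-in for mapReduce (an assumption, not a MongoDB API):
// keyFn plays the role of the key passed to emit(), and the running sum
// plays the role of the reduce function.
function countBy(records, keyFn) {
  const counts = {};
  for (const record of records) {
    const key = keyFn(record);
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

const genderCounts = countBy(persons, (p) => p.gender);
console.log(genderCounts); // { m: 2, f: 1 }
```

The real mapReduce call may invoke reduce several times for the same key, which is why the reduce function above returns a value in the same {count: ...} shape that map emits.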

To do the binning, the map function rounds each age down to the nearest multiple of two and builds the key from that value. For example, any value from 10 up to (but not including) 12 gets the same key "10-12". Then, as before, the reduce function simply adds up the counts:

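// bin ages into intervals of width 2
db.persons.mapReduce(
    // map: emit one count per record, keyed by the age interval
    function() {
        var x = Math.floor(this.age / 2) * 2;   // lower bound of the bin
        var key = x + "-" + (x + 2);            // e.g. "10-12"
        emit(key, {count: 1});
    },
    // reduce: sum the counts emitted for each interval
    function(key, values) {
        var result = {count: 0};
        values.forEach(function(value) {
            result.count += value.count;
        });
        return result;
    },
    {out: {inline: 1}}
);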
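
The key computation is the only part that differs from the gender count, so it can be checked on its own. The sketch below is plain JavaScript rather than MongoDB code; the `binKey` helper, its `width` parameter, and the sample ages are hypothetical names and values introduced here for illustration.

```javascript
// Hypothetical helper mirroring the key computation in the map function.
// width is an assumed parameter; the answer uses a fixed width of 2.
function binKey(age, width = 2) {
  const lower = Math.floor(age / width) * width; // lower bound of the bin
  return lower + "-" + (lower + width);
}

// Made-up ages, chosen to include values at and near a bin boundary.
const ages = [27.53, 10, 11.9999, 12, 0.5];

// Tally how many ages fall in each bin (the role of the reduce step).
const histogram = {};
for (const age of ages) {
  const key = binKey(age);
  histogram[key] = (histogram[key] || 0) + 1;
}

console.log(binKey(11.9999)); // "10-12"
console.log(binKey(12));      // "12-14"
console.log(histogram);
```

Because the keys are strings, sorting them sorts lexicographically ("10-12" comes before "2-4"). If the bins need to be in numeric order, one option is to emit the numeric lower bound as the key instead.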