MongoDB aggregate query isn't returning the proper sum when using $sum

Question

I have a collection students with documents in the following format:

{
 _id:"53fe74a866455060e003c2db",
 name:"sam",
 subject:"maths",
 marks:"77"
}
{
 _id:"53fe79cbef038fee879263d2",
 name:"ryan", 
 subject:"bio",
 marks:"82"
}
{
 _id:"53fe74a866456060e003c2de",
 name:"tony",
 subject:"maths",
 marks:"86"
}

I want to get the total marks of all the students with subject = "maths". So I should get 163 as the sum.

db.students.aggregate([{ $match : { subject : "maths" } },
{ "$group" : { _id : "$subject", totalMarks : { $sum : "$marks" } } }])

Now I should get the following result:

{"result":[{"_id":"53fe74a866455060e003c2db", "totalMarks":163}], "ok":1}

But I get:

{"result":[{"_id":"53fe74a866455060e003c2db", "totalMarks":0}], "ok":1}

Can someone point out what I might be doing wrong here?

Recommended answer

The marks field in your current schema is stored as a string, but the aggregation framework needs a numeric type to compute the sum: $sum ignores non-numeric values, which is why you get 0. Alternatively, you can use MapReduce to calculate the sum, since it allows native JavaScript functions such as parseInt() on document properties in its map function. So overall you have two choices.
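To make the point about string values concrete, here is a minimal plain-JavaScript sketch that models how $sum treats its inputs. It is an illustration only, not MongoDB code; the helper name sumLikeMongo is made up for this example.

```javascript
// Plain-JavaScript model of $sum (illustration only, not MongoDB code):
// numeric values are added, every other type is ignored.
function sumLikeMongo(values) {
    return values.reduce(function (total, v) {
        return typeof v === "number" ? total + v : total;
    }, 0);
}

// Marks of the two "maths" students as stored in the question: strings.
var storedMarks = ["77", "86"];

// The same marks after conversion to integers.
var numericMarks = storedMarks.map(function (m) {
    return parseInt(m, 10);
});

console.log(sumLikeMongo(storedMarks));   // 0
console.log(sumLikeMongo(numericMarks));  // 163
```

Both options that follow amount to making sure the values being summed are numbers rather than strings.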

The first is to change the schema, or add another field to your documents, so that the value is stored as an actual number rather than its string representation. If your collection is relatively small, you can combine the cursor methods find() and forEach() with update() to convert the marks field:

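// $type 2 matches values stored as strings
db.students.find({ "marks": { "$type": 2 } }).snapshot().forEach(function(doc) {
    db.students.update(
        { "_id": doc._id, "marks": { "$type": 2 } },
        // pass the radix explicitly so the string is always parsed as base 10
        { "$set": { "marks": parseInt(doc.marks, 10) } }
    );
});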
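One thing to watch when converting: parseInt() is lenient, so a value such as "77abc" becomes 77 and an empty string becomes NaN. The sketch below shows a stricter conversion that could be applied before writing the value back. The helper name toIntegerOrNull is hypothetical, and the rule that only plain digit strings count as valid marks is an assumption of this example.

```javascript
// Hypothetical helper: returns an integer for plain digit strings,
// or null when the string is anything else.
function toIntegerOrNull(value) {
    if (typeof value !== "string") return null;
    var trimmed = value.trim();
    if (!/^-?\d+$/.test(trimmed)) return null;
    return parseInt(trimmed, 10);
}

console.log(parseInt("77abc", 10));     // 77 (trailing text is silently dropped)
console.log(parseInt("", 10));          // NaN
console.log(toIntegerOrNull("77"));     // 77
console.log(toIntegerOrNull("77abc"));  // null
console.log(toIntegerOrNull(""));       // null
```

Documents for which the helper returns null could be skipped or logged rather than updated.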

For relatively large collections, updating documents one at a time will be slow, so it's recommended to use MongoDB bulk updates instead:

MongoDB versions >= 2.6 and < 3.2:

var bulk = db.students.initializeUnorderedBulkOp(),
    counter = 0;

// Only touch documents whose marks value is still a string ($type 2)
db.students.find({"marks": {"$exists": true, "$type": 2 }}).forEach(function (doc) {
    bulk.find({ "_id": doc._id }).updateOne({
        "$set": { "marks": parseInt(doc.marks, 10) }
    });

    counter++;
    if (counter % 1000 === 0) {
        // Execute per 1000 operations
        bulk.execute();

        // re-initialize every 1000 update statements
        bulk = db.students.initializeUnorderedBulkOp();
    }
})

// Clean up remaining operations in queue
if (counter % 1000 !== 0) bulk.execute();

MongoDB version 3.2 and newer:

var ops = [],
    cursor = db.students.find({"marks": {"$exists": true, "$type": 2 }});

cursor.forEach(function (doc) {
    // Queue one update per document that still stores marks as a string
    ops.push({
        "updateOne": {
            "filter": { "_id": doc._id },
            "update": { "$set": { "marks": parseInt(doc.marks, 10) } }
        }
    });

    // Send the queued updates in batches of 1000
    if (ops.length === 1000) {
        db.students.bulkWrite(ops);
        ops = [];
    }
});

// Flush whatever is left in the queue
if (ops.length > 0) db.students.bulkWrite(ops);
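The batching logic above can be tried without a database. The following sketch runs the same queue-and-flush pattern against a stand-in object; fakeCollection, flushInBatches and the generated documents are all hypothetical and only mimic the shape of a bulkWrite(ops) call.

```javascript
// Stand-in for a collection: it only records the size of each batch it gets.
// fakeCollection is hypothetical and mimics the shape of bulkWrite(ops).
var fakeCollection = {
    batchSizes: [],
    bulkWrite: function (ops) { this.batchSizes.push(ops.length); }
};

// Queue one updateOne per document and flush every batchSize operations.
function flushInBatches(docs, batchSize, collection) {
    var ops = [];
    docs.forEach(function (doc) {
        ops.push({
            updateOne: {
                filter: { _id: doc._id },
                update: { $set: { marks: parseInt(doc.marks, 10) } }
            }
        });
        if (ops.length === batchSize) {
            collection.bulkWrite(ops);
            ops = [];
        }
    });
    // Flush the final, partially filled batch
    if (ops.length > 0) collection.bulkWrite(ops);
}

// 2500 made-up documents with string marks
var docs = [];
for (var i = 0; i < 2500; i++) docs.push({ _id: i, marks: String(i % 100) });

flushInBatches(docs, 1000, fakeCollection);
console.log(fakeCollection.batchSizes); // [ 1000, 1000, 500 ]
```

The final flush matters: without it, the last 500 operations in this run would never be sent.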


Option 2: Run MapReduce

The second approach is to rewrite your query with MapReduce, where you can use the JavaScript function parseInt().

In your MapReduce operation, define the map function that processes each input document. This function converts each document's marks string to an integer and emits the subject together with the converted marks. This is where the native JavaScript function parseInt() can be applied. Note: inside the function, this refers to the document that the map-reduce operation is processing:

var mapper = function () {
    // parse the stored string as a base-10 integer
    var x = parseInt(this.marks, 10);
    emit(this.subject, x);
};

Next, define the corresponding reduce function with two arguments, keySubject and valuesMarks. valuesMarks is an array whose elements are the integer marks values emitted by the map function and grouped by keySubject. The function reduces the valuesMarks array to the sum of its elements.

var reducer = function(keySubject, valuesMarks) {
    return Array.sum(valuesMarks);
};

db.students.mapReduce(
    mapper,
    reducer,
    {
        out : "example_results",
        query: { subject : "maths" }       
    }
 );

With your collection, the above writes the MapReduce result to a new collection, db.example_results. Thus, db.example_results.find() will output:

/* 0 */
{
    "_id" : "maths",
    "value" : 163
}
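To see how the map and reduce steps fit together, here is a small plain-JavaScript simulation over the three sample documents from the question. It is a sketch, not MongoDB's implementation: runMapReduce is a hypothetical helper, emit is passed to the map function as an argument instead of being a global, and the sum is written with reduce() because Array.sum() is a mongo shell helper rather than standard JavaScript.

```javascript
// Sample documents from the question (marks stored as strings)
var students = [
    { name: "sam",  subject: "maths", marks: "77" },
    { name: "ryan", subject: "bio",   marks: "82" },
    { name: "tony", subject: "maths", marks: "86" }
];

// Hypothetical helper that imitates the map-reduce flow in memory:
// filter by query, collect emitted values per key, then reduce each group.
function runMapReduce(docs, query, mapFn, reduceFn) {
    var groups = {};
    function emit(key, value) {
        (groups[key] = groups[key] || []).push(value);
    }
    docs.filter(query).forEach(function (doc) {
        mapFn.call(doc, emit); // "this" inside mapFn is the current document
    });
    return Object.keys(groups).map(function (key) {
        return { _id: key, value: reduceFn(key, groups[key]) };
    });
}

var mapper = function (emit) {
    emit(this.subject, parseInt(this.marks, 10));
};

var reducer = function (keySubject, valuesMarks) {
    return valuesMarks.reduce(function (a, b) { return a + b; }, 0);
};

var results = runMapReduce(
    students,
    function (doc) { return doc.subject === "maths"; },
    mapper,
    reducer
);

console.log(results); // [ { _id: 'maths', value: 163 } ]
```

The simulation always calls the reducer, even for a key with a single value, which is a simplification made for this example.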
