mongodb 聚合查询在使用 $sum 时没有返回正确的总和 [英] mongodb aggregate query isn't returning proper sum on using $sum

查看:17
本文介绍了mongodb 聚合查询在使用 $sum 时没有返回正确的总和的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个包含以下格式文档的学生集合:-

I have a collection students with documents in the following format:-

{
 _id:"53fe74a866455060e003c2db",
 name:"sam",
 subject:"maths",
 marks:"77"
}
{
 _id:"53fe79cbef038fee879263d2",
 name:"ryan", 
 subject:"bio",
 marks:"82"
}
{
 _id:"53fe74a866456060e003c2de",
 name:"tony",
 subject:"maths",
 marks:"86"
}

我想得到所有科目=数学"的学生的总分.所以我应该得到 163 作为总和.

I want to get the count of total marks of all the students with subject = "maths". So I should get 163 as sum.

db.students.aggregate([{ $match : { subject : "maths" } },
{ "$group" : { _id : "$subject", totalMarks : { $sum : "$marks" } } }])

现在我应该得到以下结果-

Now I should get the following result-

{"result":[{"_id":"53fe74a866455060e003c2db", "totalMarks":163}], "ok":1}

但我明白-

{"result":[{"_id":"53fe74a866455060e003c2db", "totalMarks":0}], "ok":1}

有人可以指出我在这里可能做错了什么吗?

Can someone point out what I might be doing wrong here?

推荐答案

您当前的架构将 marks 字段数据类型设为字符串,并且您的聚合框架需要一个整数数据类型来计算和.另一方面,您可以使用 MapReduce 来计算总和因为它允许在其映射函数中的对象属性上使用原生 JavaScript 方法,例如 parseInt().所以总的来说你有两个选择.

Your current schema has the marks field data type as string and you need an integer data type for your aggregation framework to work out the sum. On the other hand, you can use MapReduce to calculate the sum since it allows the use of native JavaScript methods like parseInt() on your object properties in its map functions. So overall you have two choices.

首先是更改架构或在文档中添加另一个具有实际数值而不是字符串表示的字段.如果你的collection文档比较小,可以结合使用mongodb的游标find(),forEach()update() 方法来更改您的标记架构:

The first would be to change the schema or add another field in your document that has the actual numerical value not the string representation. If your collection document size is relatively small, you could use a combination of the mongodb's cursor find(), forEach() and update() methods to change your marks schema:

db.student.find({ "marks": { "$type": 2 } }).snapshot().forEach(function(doc) {
    db.student.update(
        { "_id": doc._id, "marks": { "$type": 2 } }, 
        { "$set": { "marks": parseInt(doc.marks) } }
    );
});

对于比较大的集合,你的db性能会很慢,建议使用mongo 批量更新:

For relatively large collection sizes, your db performance will be slow and it's recommended to use mongo bulk updates for this:

MongoDB 版本 >= 2.6 和 <3.2:

var bulk = db.student.initializeUnorderedBulkOp(),
    counter = 0;

db.student.find({"marks": {"$exists": true, "$type": 2 }}).forEach(function (doc) {    
    bulk.find({ "_id": doc._id }).updateOne({ 
        "$set": { "marks": parseInt(doc.marks) } 
    });

    counter++;
    if (counter % 1000 === 0) {
        // Execute per 1000 operations 
        bulk.execute(); 

        // re-initialize every 1000 update statements
        bulk = db.student.initializeUnorderedBulkOp();
    }
})

// Clean up remaining operations in queue
if (counter % 1000 !== 0) bulk.execute(); 

MongoDB 3.2 及更新版本:

var ops = [],
    cursor = db.student.find({"marks": {"$exists": true, "$type": 2 }});

cursor.forEach(function (doc) {     
    ops.push({ 
        "updateOne": { 
            "filter": { "_id": doc._id } ,              
            "update": { "$set": { "marks": parseInt(doc.marks) } } 
        }         
    });

    if (ops.length === 1000) {
        db.student.bulkWrite(ops);
        ops = [];
    }     
});

if (ops.length > 0) db.student.bulkWrite(ops);

<小时>

选项 2:运行 MapReduce

第二种方法是使用 MapReduce 您可以在其中使用 JavaScript 函数 parseInt().


Option 2: Run MapReduce

The second approach would be to rewrite your query with MapReduce where you can use the JavaScript function parseInt().

在您的 MapReduce 操作中,定义映射函数处理每个输入文档.此函数将转换后的 marks 字符串值映射到每个文档的 subject,并发出 subject 和转换后的 marks一对.这是可以应用 JavaScript 原生函数 parseInt() 的地方.注意:函数中this指的是map-reduce操作正在处理的文档:

In your MapReduce operation, define the map function that process each input document. This function maps the converted marks string value to the subject for each document, and emits the subject and converted marks pair. This is where the JavaScript native function parseInt() can be applied. Note: in the function, this refers to the document that the map-reduce operation is processing:

var mapper = function () {
    var x = parseInt(this.marks);
    emit(this.subject, x);
};

接下来,使用两个参数keySubjectvaluesMarks 定义相应的reduce 函数.valuesMarks 是一个数组,其元素是由 map 函数发出并按 keySubject 分组的整数 marks 值.该函数将 valuesMarks 数组减少为其元素的总和.

Next, define the corresponding reduce function with two arguments keySubject and valuesMarks. valuesMarks is an array whose elements are the integer marks values emitted by the map function and grouped by keySubject. The function reduces the valuesMarks array to the sum of its elements.

var reducer = function(keySubject, valuesMarks) {
    return Array.sum(valuesMarks);
};

db.student.mapReduce(
    mapper,
    reducer,
    {
        out : "example_results",
        query: { subject : "maths" }       
    }
 );

对于您的集合,上面的代码会将您的 MapReduce 聚合结果放入一个新集合 db.example_results.因此,db.example_results.find() 将输出:

With your collection, the above will put your MapReduce aggregation result in a new collection db.example_results. Thus, db.example_results.find() will output:

/* 0 */
{
    "_id" : "maths",
    "value" : 163
}

这篇关于mongodb 聚合查询在使用 $sum 时没有返回正确的总和的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆