计算嵌入文档/数组中字段的平均值 [英] Calculate the average of fields in embedded documents/array

查看:33
本文介绍了计算嵌入文档/数组中字段的平均值的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想用数组 ratings 中的 rating 字段计算该对象的 rating_average 字段.你能帮我理解如何使用 $avg 聚合吗?

I want to calculate the rating_average field of this object with the rating fields inside the array ratings. Can you help me to understand how to use aggregation with $avg?

{
    "title": "The Hobbit",
    "rating_average": "???",
    "ratings": [
        {
            "title": "best book ever",
            "rating": 5
        },
        {
            "title": "good book",
            "rating": 3.5
        }
    ]
}

推荐答案

聚合框架 在 MongoDB 3.4 和更新版本中提供了 $reduce 运算符,无需额外管道即可高效计算总数.考虑使用它作为表达式来返回总评分并使用 $size.连同 $addFields,因此可以使用算术运算符 $divide 如公式average = total ratings/number of ratings:

The aggregation framework in MongoDB 3.4 and newer offers the $reduce operator which efficiently calculates the total without the need for extra pipelines. Consider using it as an expression to return the total ratings and get the number of ratings using $size. Together with $addFields, the average can thus be calculated using the arithmetic operator $divide as in the formula average = total ratings/number of ratings:

db.collection.aggregate([
    { 
        "$addFields": { 
            "rating_average": {
                "$divide": [
                    { // expression returns total
                        "$reduce": {
                            "input": "$ratings",
                            "initialValue": 0,
                            "in": { "$add": ["$$value", "$$this.rating"] }
                        }
                    },
                    { // expression returns ratings count
                        "$cond": [
                            { "$ne": [ { "$size": "$ratings" }, 0 ] },
                            { "$size": "$ratings" }, 
                            1
                        ]
                    }
                ]
            }
        }
    }           
])

样本输出

{
    "_id" : ObjectId("58ab48556da32ab5198623f4"),
    "title" : "The Hobbit",
    "ratings" : [ 
        {
            "title" : "best book ever",
            "rating" : 5.0
        }, 
        {
            "title" : "good book",
            "rating" : 3.5
        }
    ],
    "rating_average" : 4.25
}

<小时>

对于旧版本,您需要先应用 ratings 数组字段上的 $unwind 运算符首先作为您的初始聚合管道步骤.这将从输入文档中解构 ratings 数组字段以输出每个元素的文档.每个输出文档都用元素值替换数组.


With older versions, you would need to first apply the $unwind operator on the ratings array field first as your initial aggregation pipeline step. This will deconstruct the ratings array field from the input documents to output a document for each element. Each output document replaces the array with an element value.

第二个管道阶段将是 $group 运算符,它通过 _idtitle 键标识符表达式对输入文档进行分组,并应用所需的 $avg 累加器表达式到计算平均值的每个组.还有另一个累加器运算符 $push 通过返回将表达式应用于上述组中的每个文档所产生的所有值的数组来保留原始 ratings 数组字段.

The second pipeline stage would be the $group operator which groups input documents by the _id and title keys identifier expression and applies the desired $avg accumulator expression to each group that calculates the average. There is another accumulator operator $push that preserves the original ratings array field by returning an array of all values that result from applying an expression to each document in the above group.

最后的管道步骤是$project 运算符,然后重塑流中的每个文档,例如添加新字段 ratings_average.

The final pipeline step is the $project operator which then reshapes each document in the stream, such as by adding the new field ratings_average.

因此,例如,如果您的集合中有一个示例文档(如上和下):

So, if for instance you have a sample document in your collection (as from above and so below):

db.collection.insert({
    "title": "The Hobbit",

    "ratings": [
        {
            "title": "best book ever",
            "rating": 5
        },
        {
            "title": "good book",
            "rating": 3.5
        }
    ]
})

要计算 ratings 数组平均值并将值投影到另一个字段 ratings_average,您可以应用以下聚合管道:

To calculate the ratings array average and projecting the value in another field ratings_average, you can then apply the following aggregation pipeline:

db.collection.aggregate([
    {
        "$unwind": "$ratings"
    },
    {
        "$group": {
            "_id": {
                "_id": "$_id",
                "title": "$title"
            },
            "ratings":{
                "$push": "$ratings"
            },
            "ratings_average": {
                "$avg": "$ratings.rating"
            }
        }
    },
    {
        "$project": {
            "_id": 0,
            "title": "$_id.title",
            "ratings_average": 1,
            "ratings": 1
        }
    }
])

结果:

/* 1 */
{
    "result" : [ 
        {
            "ratings" : [ 
                {
                    "title" : "best book ever",
                    "rating" : 5
                }, 
                {
                    "title" : "good book",
                    "rating" : 3.5
                }
            ],
            "ratings_average" : 4.25,
            "title" : "The Hobbit"
        }
    ],
    "ok" : 1
}

这篇关于计算嵌入文档/数组中字段的平均值的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆