Average a Sub Document Field Across Documents in Mongo
Question
For a given record id, how do I get the average of a sub document field if I have the following in MongoDB:
/* 0 */
{
"item" : "1",
"samples" : [
{
"key" : "test-key",
"value" : "1"
},
{
"key" : "test-key2",
"value" : "2"
}
]
}
/* 1 */
{
"item" : "1",
"samples" : [
{
"key" : "test-key",
"value" : "3"
},
{
"key" : "test-key2",
"value" : "4"
}
]
}
I want to get the average of the values where key = "test-key" for a given item id (in this case 1). So the average should be (1 + 3) / 2 = 2.
Thanks
Recommended answer
You'll need to use the aggregation framework. The aggregation will end up looking something like this:
db.stack.aggregate([
{ $match: { "samples.key" : "test-key" } },
{ $unwind : "$samples" },
{ $match : { "samples.key" : "test-key" } },
{ $project : { "new_key" : "$samples.key", "new_value" : "$samples.value" } },
{ $group : { _id : "$new_key", answer : { $avg : "$new_value" } } }
])
The best way to think of the aggregation framework is like an assembly line. The query itself is an array of JSON documents, where each sub-document represents a different step in the assembly.
The first step is a basic filter, like a WHERE clause in SQL. We place this step first to filter out all documents that do not contain an array element whose key is test-key. Placing this at the beginning of the pipeline allows the aggregation to use indexes.
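As a rough illustration of what that first $match keeps, here is a plain-JavaScript sketch that runs without MongoDB. The helper name matchStage and the second sample document are made up for the example, and the values are stored as numbers rather than the strings used in the question.

```javascript
// Plain-JavaScript stand-in for { $match: { "samples.key": "test-key" } }.
// matchStage and the sample data are hypothetical, for illustration only.
const docs = [
  { item: "1", samples: [{ key: "test-key", value: 1 }, { key: "test-key2", value: 2 }] },
  { item: "2", samples: [{ key: "other-key", value: 9 }] },
];

// Keep a document if at least one element of its samples array has the key.
// The whole document is kept, including the non-matching array elements.
const matchStage = (input, key) =>
  input.filter((doc) => doc.samples.some((sample) => sample.key === key));

const matched = matchStage(docs, "test-key");
console.log(matched.map((doc) => doc.item));
```

Note that the surviving document still carries its test-key2 element, which is why a second filter is needed after $unwind.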
The second step, $unwind, separates each element of the "samples" array into its own document so we can perform operations across all of them. If you run the query with just that step, you'll see what I mean. Long story short:
{ name : "bob",
  children : [ { "name" : "mary" }, { "name" : "sue" } ]
}
becomes two documents:
{ name : "bob", children : { "name" : "mary" } }
{ name : "bob", children : { "name" : "sue" } }
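The $unwind behavior can be approximated in plain JavaScript with flatMap. This is only a sketch: the unwind helper is a hypothetical name, and it assumes every document actually has the array field (documents with an empty array simply produce no output here).

```javascript
// Plain-JavaScript approximation of { $unwind: "$children" }.
// unwind is a hypothetical helper; it assumes doc[field] is always an array.
const unwind = (input, field) =>
  input.flatMap((doc) =>
    // Emit one copy of the document per array element, with the array
    // field replaced by that single element.
    doc[field].map((element) => ({ ...doc, [field]: element }))
  );

const family = [{ name: "bob", children: [{ name: "mary" }, { name: "sue" }] }];
const unwound = unwind(family, "children");
console.log(unwound);
```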
Step 3: $match
The third step, $match, is an exact duplicate of the first $match stage, but has a different purpose. Since it follows $unwind, this stage filters out the former array elements, now separate documents, that don't match the filter criteria. In this case, we keep only documents where samples.key = "test-key".
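To make the difference from the first $match concrete, here is a small plain-JavaScript sketch of the filter applied to already-unwound documents. The data is the question's sample data after unwinding, with the values written as numbers (an assumption, since the question stores them as strings).

```javascript
// After $unwind, samples is a single object per document, not an array.
const unwoundSamples = [
  { item: "1", samples: { key: "test-key", value: 1 } },
  { item: "1", samples: { key: "test-key2", value: 2 } },
  { item: "1", samples: { key: "test-key", value: 3 } },
  { item: "1", samples: { key: "test-key2", value: 4 } },
];

// Stand-in for the second { $match: { "samples.key": "test-key" } }:
// this time the filter drops individual former array elements.
const kept = unwoundSamples.filter((doc) => doc.samples.key === "test-key");
console.log(kept.map((doc) => doc.samples.value));
```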
The fourth step, $project, restructures the document. In this case, I pulled the items out of the array so I could reference them directly. Using the example above:
{ name : "bob", children : { "name" : "mary" } }
becomes
{ new_name : "bob", new_child_name : "mary" }
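In plain JavaScript, the reshaping done by $project is roughly a map over the documents. The sketch below mirrors the new_key / new_value projection from the pipeline; the _id values are invented for the example, and _id is copied through because $project keeps it unless told otherwise.

```javascript
// Documents as they might look after $unwind and the second $match.
// The _id values here are hypothetical.
const afterSecondMatch = [
  { _id: 0, item: "1", samples: { key: "test-key", value: 1 } },
  { _id: 1, item: "1", samples: { key: "test-key", value: 3 } },
];

// Stand-in for:
// { $project: { new_key: "$samples.key", new_value: "$samples.value" } }
const projected = afterSecondMatch.map((doc) => ({
  _id: doc._id, // kept by default
  new_key: doc.samples.key,
  new_value: doc.samples.value,
}));
console.log(projected);
```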
Note that this step is entirely optional; later stages could be completed even without this $project after a few minor changes. In most cases $project is entirely cosmetic; aggregations have numerous optimizations under the hood such that manually including or excluding fields in a $project should not be necessary.
Finally, $group is where the magic happens. The _id value is what you would be "grouping by" in the SQL world. The second field says to average over the value that I defined in the $project step. You can easily substitute $sum to perform a sum, but a count operation is typically done the following way: my_count : { $sum : 1 }.
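For intuition, the grouping step can be sketched in plain JavaScript as bucketing by key and keeping a running sum and count. This is only an approximation of what $group does, using numeric values (an assumption, since the question's data stores strings); my_count plays the role of { $sum : 1 }.

```javascript
// Input shaped like the output of the $project stage.
const projectedDocs = [
  { new_key: "test-key", new_value: 1 },
  { new_key: "test-key", new_value: 3 },
];

// Bucket by new_key (the _id of the $group stage) and accumulate.
const buckets = new Map();
for (const doc of projectedDocs) {
  const bucket = buckets.get(doc.new_key) ?? { sum: 0, count: 0 };
  bucket.sum += doc.new_value; // what { $sum: "$new_value" } would add up
  bucket.count += 1;           // what { $sum: 1 } would add up
  buckets.set(doc.new_key, bucket);
}

// One output document per group, with the average as sum / count.
const grouped = [...buckets].map(([key, bucket]) => ({
  _id: key,
  answer: bucket.sum / bucket.count,
  my_count: bucket.count,
}));
console.log(grouped);
```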
The most important thing to note here is that the majority of the work being done is to format the data to a point where performing the operation is simple.
Lastly, I wanted to note that this would not work on the example data provided since samples.value is defined as text, which can't be used in arithmetic operations. If you're interested, changing the type of a field is described here: MongoDB How to change the type of a field
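If changing the stored type is not an option, one possible variation is to convert the string inside the pipeline. The sketch below is not part of the original answer: it assumes MongoDB 4.0 or newer, where the $toDouble operator is available, it adds an item filter to the first $match to cover the "given item id" part of the question, and it drops the optional $project stage. The itemId variable is a hypothetical input.

```javascript
// Assumption: MongoDB 4.0+ so that $toDouble exists.
const itemId = "1"; // hypothetical input: the item id to average over

const pipeline = [
  // Filter early on both the item and the key so indexes can be used.
  { $match: { item: itemId, "samples.key": "test-key" } },
  { $unwind: "$samples" },
  // Drop the unwound elements that have a different key.
  { $match: { "samples.key": "test-key" } },
  // Convert the string value to a number before averaging.
  { $group: { _id: "$samples.key", answer: { $avg: { $toDouble: "$samples.value" } } } },
];

// In the mongo shell this would be run as: db.stack.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```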