MongoDB aggregation framework: $group + $project


Question

I have the following issue.

This query returns one result, which is what I want:

> db.items.aggregate([ {$group: { "_id": "$id", version: { $max: "$version" } } }])

{
"result" : [
    {
        "_id" : "b91e51e9-6317-4030-a9a6-e7f71d0f2161",
        "version" : 1.2000000000000002
    }
],
"ok" : 1
}

This query (I only added a projection so that I can later query for the entire document) returns multiple results. What am I doing wrong?

> db.items.aggregate([ {$group: { "_id": "$id", version: { $max: "$version" } }, $project: { _id : 1 } }])

{
"result" : [
    {
        "_id" : ObjectId("5139310a3899d457ee000003")
    },
    {
        "_id" : ObjectId("513931053899d457ee000002")
    },
    {
        "_id" : ObjectId("513930fd3899d457ee000001")
    }
],
"ok" : 1
}

Solution
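For the query in the question specifically: $group and $project are written there as two keys of a single stage object. Each element of the pipeline array is meant to hold exactly one stage operator. The output shown is consistent with only the $project having been applied to the raw documents, and current server versions reject a stage object with more than one field. A minimal sketch of the corrected pipeline, with each stage as its own array element, could look like this (the `typeof db` guard is an addition so the snippet does nothing outside a shell that provides `db`):

```javascript
// Each stage is a separate object in the pipeline array.
const pipeline = [
  // Stage 1: one output document per distinct "id", keeping the highest version.
  { $group: { _id: "$id", version: { $max: "$version" } } },
  // Stage 2: reshape each grouped document, keeping only _id.
  { $project: { _id: 1 } },
];

// Runs only where a shell provides `db` (for example mongosh).
if (typeof db !== "undefined") {
  printjson(db.items.aggregate(pipeline).toArray());
}
```

Because $project now receives the output of $group instead of the original documents, its _id is the grouped id value, not each document's ObjectId.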

Not all accumulators are available in the $project stage. We need to consider what $project can do with accumulators and what belongs in $group. Let's take a look at this:


db.companies.aggregate([{
  $match: {
    funding_rounds: {
      $ne: []
    }
  }
}, {
  $unwind: "$funding_rounds"
}, {
  $sort: {
    "funding_rounds.funded_year": 1,
    "funding_rounds.funded_month": 1,
    "funding_rounds.funded_day": 1
  }
}, {
  $group: {
    _id: {
      company: "$name"
    },
    funding: {
      $push: {
        amount: "$funding_rounds.raised_amount",
        year: "$funding_rounds.funded_year"
      }
    }
  }
}, ]).pretty()

The $match stage keeps only companies whose funding_rounds array is not empty. $unwind then emits one document for each element of the funding_rounds array of every company, and those documents flow on to $sort and the later stages. The first thing we do with them is $sort on:

  1. funding_rounds.funded_year
  2. funding_rounds.funded_month
  3. funding_rounds.funded_day

In the $group stage, which groups by company name, the funding array is built with $push. $push appears inside a document that is the value of a field we name in the $group stage, and it accepts any valid expression. Here we push documents built from raised_amount and funded_year; each one is appended to the end of the array being accumulated for its group. The output of the $group stage is therefore a stream of documents, each with an _id holding the company name and a funding array.
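To make the flow concrete, here is a minimal in-memory sketch in plain JavaScript of what the $match, $unwind, $sort and $group/$push stages do. It does not talk to MongoDB; the sample companies and amounts are made up for illustration, and only the field names come from the pipeline above.

```javascript
// Hypothetical sample data; field names follow the pipeline above.
const companies = [
  { name: "Acme", funding_rounds: [
    { raised_amount: 500, funded_year: 2010, funded_month: 6, funded_day: 1 },
    { raised_amount: 100, funded_year: 2008, funded_month: 3, funded_day: 15 },
  ] },
  { name: "Globex", funding_rounds: [
    { raised_amount: 250, funded_year: 2009, funded_month: 1, funded_day: 20 },
  ] },
  { name: "Initech", funding_rounds: [] },
];

// $match + $unwind: one document per array element; empty arrays yield nothing.
const unwound = companies.flatMap((c) =>
  c.funding_rounds.map((r) => ({ name: c.name, funding_rounds: r }))
);

// $sort: by year, then month, then day, all ascending.
unwound.sort((a, b) =>
  a.funding_rounds.funded_year - b.funding_rounds.funded_year ||
  a.funding_rounds.funded_month - b.funding_rounds.funded_month ||
  a.funding_rounds.funded_day - b.funding_rounds.funded_day
);

// $group with $push: append to the group's array in arrival order.
const groups = new Map();
for (const doc of unwound) {
  if (!groups.has(doc.name)) {
    groups.set(doc.name, { _id: { company: doc.name }, funding: [] });
  }
  groups.get(doc.name).funding.push({
    amount: doc.funding_rounds.raised_amount,
    year: doc.funding_rounds.funded_year,
  });
}
const grouped = [...groups.values()];
console.log(JSON.stringify(grouped, null, 2));
```

The loop carries state (the `groups` map) from one document to the next, which is exactly what an accumulator needs. The order in which the groups come out here is a side effect of the Map; MongoDB does not guarantee the order of $group output.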

Notice that $push is available in the $group stage but not in the $project stage. This is because $group is designed to take a sequence of documents and accumulate values across that stream of documents.

$project, on the other hand, works with one document at a time. We can, for example, calculate an average over an array within an individual document in a $project stage. But accumulating across documents, where each document that arrives pushes a new value onto a running result, is something the $project stage is not designed to do. For that type of operation we want to use $group.
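As an illustration of the per-document case, the following sketch averages the raised amounts inside each company document using $avg in a $project stage. It assumes MongoDB 3.2 or later, where $avg became usable in $project; the output field name `avg_raised` and the `typeof db` guard are additions for this example.

```javascript
// Per-document average inside $project: no $unwind or $group is needed,
// because every value involved lives in the same document.
const avgPipeline = [
  { $match: { funding_rounds: { $exists: true, $ne: [] } } },
  {
    $project: {
      _id: 0,
      company: "$name",
      // "$funding_rounds.raised_amount" resolves to an array of the amounts.
      avg_raised: { $avg: "$funding_rounds.raised_amount" },
    },
  },
];

// Runs only where a shell provides `db` (for example mongosh).
if (typeof db !== "undefined") {
  printjson(db.companies.aggregate(avgPipeline).toArray());
}
```

Each output document depends on exactly one input document, which is why this fits $project. An average across several companies would need $group instead.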

Let's take a look at another example:


db.companies.aggregate([{
  $match: {
    funding_rounds: {
      $exists: true,
      $ne: []
    }
  }
}, {
  $unwind: "$funding_rounds"
}, {
  $sort: {
    "funding_rounds.funded_year": 1,
    "funding_rounds.funded_month": 1,
    "funding_rounds.funded_day": 1
  }
}, {
  $group: {
    _id: {
      company: "$name"
    },
    first_round: {
      $first: "$funding_rounds"
    },
    last_round: {
      $last: "$funding_rounds"
    },
    num_rounds: {
      $sum: 1
    },
    total_raised: {
      $sum: "$funding_rounds.raised_amount"
    }
  }
}, {
  $project: {
    _id: 0,
    company: "$_id.company",
    first_round: {
      amount: "$first_round.raised_amount",
      article: "$first_round.source_url",
      year: "$first_round.funded_year"
    },
    last_round: {
      amount: "$last_round.raised_amount",
      article: "$last_round.source_url",
      year: "$last_round.funded_year"
    },
    num_rounds: 1,
    total_raised: 1,
  }
}, {
  $sort: {
    total_raised: -1
  }
}]).pretty()

In the $group stage we use the $first and $last accumulators. As with $push, $first and $last cannot be used as accumulators in a $project stage, because $project is designed to reshape documents one at a time, not to accumulate values across multiple documents.

The total number of rounds is calculated with the $sum operator: the value 1 adds one for every document grouped under a given _id value, so it counts the documents in the group.

The $project stage may look complex, but it only tidies the output: it flattens _id.company into company, keeps three fields from each of first_round and last_round, and passes num_rounds and total_raised through unchanged.
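The behaviour of $first, $last and the two $sum accumulators can be sketched in plain JavaScript as a single pass over documents that are already unwound and sorted. This is an in-memory illustration under made-up sample data, not MongoDB's implementation.

```javascript
// Hypothetical unwound documents, already sorted by funding date.
const sortedRounds = [
  { name: "Acme", funding_rounds: { raised_amount: 100, funded_year: 2008 } },
  { name: "Globex", funding_rounds: { raised_amount: 250, funded_year: 2009 } },
  { name: "Acme", funding_rounds: { raised_amount: 500, funded_year: 2010 } },
];

const byCompany = new Map();
for (const doc of sortedRounds) {
  let g = byCompany.get(doc.name);
  if (!g) {
    // $first: captured once, from the first document that reaches the group.
    g = {
      _id: { company: doc.name },
      first_round: doc.funding_rounds,
      last_round: null,
      num_rounds: 0,
      total_raised: 0,
    };
    byCompany.set(doc.name, g);
  }
  g.last_round = doc.funding_rounds;                  // $last: overwritten by every document
  g.num_rounds += 1;                                  // $sum: 1
  g.total_raised += doc.funding_rounds.raised_amount; // $sum: "$funding_rounds.raised_amount"
}

// Final stage: $sort: { total_raised: -1 }
const summary = [...byCompany.values()].sort(
  (a, b) => b.total_raised - a.total_raised
);
console.log(JSON.stringify(summary, null, 2));
```

$first and $last only mean "earliest" and "latest" round here because the documents were sorted before they reached the group; without the preceding $sort they would simply be whichever documents arrived first and last.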
