MongoDB group by array inner-elements

Question

I've got a list of articles, and each of them has an array property which lists various individuals mentioned in them:

_id: {
    $oid: "52b632a9e4f2ba13c82ccd23"
},
providerName: "The Guardian",
url: "http://feeds.theguardian.com/c/34708/f/663860/s/3516cebc/sc/38/l/0L0Stheguardian0N0Cmusic0C20A130Cdec0C220Cwaterboys0Efishermans0Eblues0Etour0Ehammersmith/story01.htm",
subject: "The Waterboys – review",
class_artist: [
    "paul mccartney"
]

I've been trying (unsuccessfully) to get a list of all the individual artists (class_artist), based on the number of articles they've been tagged in within the past 7 days.

This is how far I've got:

var date = new Date();
date.setDate(date.getDate() - 7);

db.articles.group({
    key: { class_artist: 1 },
    cond: { class_date: { $gt: date } },
    reduce: function ( curr, result ) { result.cnt++; },
    initial: { cnt : 0 }
}).sort({cnt: -1});

But unfortunately, it doesn't count them based on the individual array values, but by array compositions (that is, lists of artists).

I tried using the $unwind function, but have not been able to make it work.
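
A minimal plain-JavaScript sketch of the behaviour described above, with no database involved. The sample documents and the helper name countByWholeArray are made up for illustration; it only mimics what grouping on the whole class_artist array does, and is not how group() is implemented.

```javascript
// Hypothetical sample documents; only class_artist matters here.
const sampleArticles = [
  { class_artist: ["paul mccartney"] },
  { class_artist: ["paul mccartney", "the waterboys"] },
  { class_artist: ["paul mccartney", "the waterboys"] },
];

// Group on the whole array: every distinct list of artists
// becomes its own bucket, so artists are never counted individually.
function countByWholeArray(docs) {
  const counts = new Map();
  for (const doc of docs) {
    const key = JSON.stringify(doc.class_artist);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

const byList = countByWholeArray(sampleArticles);
console.log([...byList.entries()]);
```

Here "paul mccartney" appears in three articles, but no bucket carries that total because the count is split across the two distinct lists.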

Recommended answer

What framework are you using? This is not the MongoDB shell; it looks like some odd wrapper around MapReduce. In that case $unwind would not be available, since it is only for use in the aggregation framework (http://docs.mongodb.org/manual/core/aggregation-pipeline/). Here's what you want in the mongo shell:

db.articles.aggregate([
  {$match: { class_date: { $gte: date } } },
  {$project: { _id: 0, class_artist: 1 } },
  {$unwind: "$class_artist" },
  {$group: { _id: "$class_artist", tags: { $sum: 1 } }},
  {$project: { _id: 0, class_artist: "$_id", tags: 1 } },
  {$sort: { tags: -1 } }
])

So effectively:

  1. Filter by date, using the variable you already set for the last 7 days
  2. Project only the field(s) we need (here, only one)
  3. Unwind the array, so we now have a record for every array element in every document
  4. Group on the artist from the expanded documents
  5. Project into a document format you can use, since $group put the artist into _id
  6. Sort the results in descending order to see the most-tagged artists first
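
To see what steps 3 to 6 do to the data, here is a rough in-memory imitation in plain JavaScript. It is only a sketch: the sample documents and the helper names unwind and countBy are made up for illustration, and the real work is done by the server, not by code like this.

```javascript
// Hypothetical documents, as they would look after $match and the first $project.
const taggedArticles = [
  { class_artist: ["paul mccartney"] },
  { class_artist: ["paul mccartney", "the waterboys"] },
  { class_artist: ["the waterboys", "mike scott"] },
  { class_artist: ["paul mccartney"] },
];

// Imitates $unwind: one record per array element.
function unwind(docs, field) {
  return docs.flatMap(doc =>
    doc[field].map(value => ({ ...doc, [field]: value }))
  );
}

// Imitates $group with { $sum: 1 }, plus the reshaping $project.
function countBy(records, field) {
  const counts = new Map();
  for (const rec of records) {
    counts.set(rec[field], (counts.get(rec[field]) || 0) + 1);
  }
  return [...counts].map(([value, tags]) => ({ [field]: value, tags }));
}

const unwound = unwind(taggedArticles, "class_artist");
const result = countBy(unwound, "class_artist")
  .sort((a, b) => b.tags - a.tags); // imitates $sort: { tags: -1 }

console.log(result);
```

Each artist now gets one count across all articles, regardless of which other artists shared the array.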

And the great thing about aggregation is you can gradually build up those stages to see what is going on.
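
One way to do that, sketched here under the assumption that you keep the stages of the pipeline above in an array: run only the first few stages and look at the intermediate output. The helper firstStages is a made-up name, not part of MongoDB.

```javascript
// Same cutoff as in the question: 7 days ago.
const date = new Date();
date.setDate(date.getDate() - 7);

// The stages of the pipeline above, kept in one array.
const stages = [
  { $match: { class_date: { $gte: date } } },
  { $project: { _id: 0, class_artist: 1 } },
  { $unwind: "$class_artist" },
  { $group: { _id: "$class_artist", tags: { $sum: 1 } } },
  { $project: { _id: 0, class_artist: "$_id", tags: 1 } },
  { $sort: { tags: -1 } },
];

// Hypothetical helper: return only the first n stages.
function firstStages(n) {
  return stages.slice(0, n);
}

// In the mongo shell, inspect the output step by step:
//   db.articles.aggregate(firstStages(1))  // documents after $match
//   db.articles.aggregate(firstStages(3))  // one record per artist per article
//   db.articles.aggregate(stages)          // the full pipeline
```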

Shake and bake into your own driver implementation or ODM framework as required.
