MongoDB group by array inner-elements


Question

I've got a list of articles, and each of them has an array property that lists the individuals mentioned in it:

_id: {
    $oid: "52b632a9e4f2ba13c82ccd23"
},
providerName: "The Guardian",
url: "http://feeds.theguardian.com/c/34708/f/663860/s/3516cebc/sc/38/l/0L0Stheguardian0N0Cmusic0C20A130Cdec0C220Cwaterboys0Efishermans0Eblues0Etour0Ehammersmith/story01.htm",
subject: "The Waterboys – review",
class_artist: [
    "paul mccartney"
]

I've been trying (unsuccessfully) to get a list of all the individual artists (class_artist), ranked by the number of articles they've been tagged in within the past 7 days.

I've gotten as far as this:

var date = new Date();
date.setDate(date.getDate() - 7);

db.articles.group({
    key: { class_artist: 1 },
    cond: { class_date: { $gt: date } },
    reduce: function ( curr, result ) { result.cnt++; },
    initial: { cnt : 0 }
}).sort({cnt: -1});

But unfortunately, it doesn't count them by individual array values; it counts them by array composition (that is, by whole lists of artists).

I tried using the $unwind operator, but have not been able to make it work.

Recommended answer

What framework are you using? This does not look like the MongoDB shell; it looks like some weird wrapper around MapReduce. In that case $unwind would not be available, because it can only be used in the aggregation framework (http://docs.mongodb.org/manual/core/aggregation-pipeline/). Here's what you want in the mongo shell:

db.articles.aggregate([
  {$match: { class_date: { $gte: date } } },
  {$project: { _id: 0, class_artist: 1 } },
  {$unwind: "$class_artist" },
  {$group: { _id: "$class_artist", tags: { $sum: 1 } }},
  {$project: { _id: 0,class_artist: "$_id", tags: 1 } },
  {$sort: { tags: -1 } }
])

So, effectively:

  1. Filter by date, because you already set a var for the last 7 days
  2. Project only the field(s) we need (we need only one!)
  3. Unwind the array, so we now have a record for every array element in every document
  4. Group on the artist from the expanded documents
  5. Project into a document format you can use, since $group moved the artist name into _id
  6. Sort the results in descending order to see the most-tagged artists first
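To make the effect of each stage concrete, here is a rough plain-JavaScript sketch that mimics the pipeline on an in-memory array. It is not MongoDB code: the sample documents, the daysAgo helper and the countArtistTags function are all hypothetical and exist only for illustration.

```javascript
// Hypothetical sample data: only class_date and class_artist matter here.
const now = new Date();
const daysAgo = (n) => new Date(now.getTime() - n * 24 * 60 * 60 * 1000);

const articles = [
  { subject: "A", class_date: daysAgo(1), class_artist: ["paul mccartney", "the waterboys"] },
  { subject: "B", class_date: daysAgo(2), class_artist: ["paul mccartney"] },
  { subject: "C", class_date: daysAgo(30), class_artist: ["the waterboys"] }, // older than 7 days
];

// Illustrative stand-in for the pipeline; not a MongoDB API.
function countArtistTags(docs, since) {
  // $match: keep only articles dated on or after `since`
  const recent = docs.filter((d) => d.class_date >= since);

  // $project + $unwind: one entry per array element
  // (documents with a missing or empty array contribute nothing)
  const unwound = recent.flatMap((d) => d.class_artist || []);

  // $group with { $sum: 1 }: count occurrences per artist
  const counts = new Map();
  for (const artist of unwound) {
    counts.set(artist, (counts.get(artist) || 0) + 1);
  }

  // final $project + $sort: reshape, then order by tag count descending
  return [...counts]
    .map(([class_artist, tags]) => ({ class_artist, tags }))
    .sort((a, b) => b.tags - a.tags);
}

const result = countArtistTags(articles, daysAgo(7));
console.log(result);
```

The key difference from grouping on the array itself is the flatMap step: each artist is counted separately, rather than each distinct list of artists.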

And the great thing about aggregation is that you can build up those stages gradually to see what is going on.

Shake and bake into your own driver implementation or ODM framework as required.
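One possible way to build the stages up gradually is to keep them in an array and run only the first few. The sketch below assumes the same pipeline as above; the firstStages helper is hypothetical and not part of MongoDB.

```javascript
// Cut-off date for "the last 7 days", as in the question.
const date = new Date();
date.setDate(date.getDate() - 7);

// The same stages as the pipeline above, kept in an array.
const stages = [
  { $match: { class_date: { $gte: date } } },
  { $project: { _id: 0, class_artist: 1 } },
  { $unwind: "$class_artist" },
  { $group: { _id: "$class_artist", tags: { $sum: 1 } } },
  { $project: { _id: 0, class_artist: "$_id", tags: 1 } },
  { $sort: { tags: -1 } },
];

// Hypothetical helper: return a pipeline made of the first n stages only.
function firstStages(n) {
  return stages.slice(0, n);
}

// Print each partial pipeline so you can see how it grows.
for (let n = 1; n <= stages.length; n++) {
  console.log(n, JSON.stringify(firstStages(n)));
}
```

In the mongo shell you could then run, for example, db.articles.aggregate(firstStages(3)) to inspect the unwound documents before any grouping happens.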
