How to weight documents to create sort criteria?

Problem description

I'm trying to aggregate a collection in which there are documents that look like this:

[
  {
    "title" : 1984,
    "tags" : ['dystopia', 'apocalypse', 'future',....]
  },
  ....
]

And I have a criteria array of keywords, for instance:

var keywords = ['future', 'google', 'cat',....]

What I would like to achieve is to aggregate the collection and $group it by a "convenience" criterion, so that the documents are sorted with those containing the most keywords in their tags field first.

This means that a document whose tags contain 'future', 'google', and 'cat' will be sorted before another one whose tags contain 'future', 'cat', and 'apple'.
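
To make the intended ordering concrete, here is a minimal plain-JavaScript sketch that runs outside MongoDB. The sample titles and the `countMatches` helper are made up for illustration; only the tag values come from the question.

```javascript
// Hypothetical sample data; only the tag names come from the question.
const keywords = ['future', 'google', 'cat'];
const books = [
  { title: 'A', tags: ['future', 'cat', 'apple'] },
  { title: 'B', tags: ['future', 'google', 'cat'] },
];

// Count how many of a document's tags appear in the keyword list.
function countMatches(tags, keywordList) {
  return tags.filter((tag) => keywordList.includes(tag)).length;
}

// Attach the match count as a weight and sort with the highest weight first.
const ranked = books
  .map((book) => ({ title: book.title, weight: countMatches(book.tags, keywords) }))
  .sort((a, b) => b.weight - a.weight);

console.log(ranked);
```

Here 'B' matches three keywords and 'A' matches two, so 'B' comes first. This per-document match count is the "convenience" weight the aggregation should produce.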

So far, I have tried something like this:

db.books.aggregate(
   { $group : { _id : {title:"$title"} , convenience: { $sum: { $cond: [ {tags: {$in: keywords}}, 1, 0 ] } } } },
            { $sort : {'convenience': -1}})

But the $in operator is not a boolean so it does not work. I've looked around and didn't find any operator that could help me with this.

Recommended answer

As you said you need a logical operator in order to evaluate $cond. It's a bit terse, but here is an implementation using $or :

db.books.aggregate([
    {$unwind: "$tags" },
    {$group: {
        _id: "$title",
        weight: {
            $sum: {$cond: [
               // Test *equality* of the `tags` value against any of the list 
               {$or: [
                   {$eq: ["$tags", "future"]},
                   {$eq: ["$tags", "google"]},
                   {$eq: ["$tags", "cat"]},
               ]},
            1, 0 ]}
        }
    }}
])

I'll leave the rest of the implementation up to you, but this should show the basic construction to the point of the matching you want to do.
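
For readers unsure what the two stages compute, the following is a rough in-memory emulation in plain JavaScript. It is not MongoDB code, and the sample documents and variable names are hypothetical. It only mirrors the logic: $unwind emits one row per tag, and $group sums 1 for each row whose tag equals one of the keywords.

```javascript
// Hypothetical sample documents, shaped like the ones in the question.
const docs = [
  { title: 'Book A', tags: ['dystopia', 'future', 'cat'] },
  { title: 'Book B', tags: ['future', 'google', 'cat'] },
];
const wanted = ['future', 'google', 'cat'];

// Emulates $unwind: one output row per element of the `tags` array.
const unwound = docs.flatMap((doc) =>
  doc.tags.map((tag) => ({ title: doc.title, tags: tag }))
);

// Emulates $group with $sum/$cond: add 1 for a matching tag, 0 otherwise, per title.
const weights = {};
for (const row of unwound) {
  // Equivalent of the $or over the individual $eq tests.
  const isMatch = wanted.some((keyword) => keyword === row.tags);
  weights[row.title] = (weights[row.title] || 0) + (isMatch ? 1 : 0);
}

console.log(weights); // Book A -> 2, Book B -> 3
```

Because the match is evaluated once per unwound row, the summed weight equals the number of matching tags in each document.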

From your comments, there also seems to be a programming issue you are struggling with: how to perform an aggregation like this when the items to query are held in an array of the form you gave above:

var keywords = ['future', 'google', 'cat',....]

Since this structure cannot be directly employed in the pipeline condition, what you need to do is transform it into what you need. Each language has its own approach, but in a JavaScript version:

var keywords = ['future', 'google', 'cat'];
var orCondition = [];

keywords.forEach(function(value) {
    var doc = {$eq: [ "$tags", value ]};
    orCondition.push(doc);
});

And then just define the aggregation query with the orCondition variable in place:

db.books.aggregate([
    {$unwind: "$tags" },
    {$group: {
        _id: "$title",
        weight: {
            $sum: {$cond: [
               // orCondition holds one $eq test per keyword
               {$or: orCondition },
            1, 0 ]}
        }
    }}
])

Or for that matter, any of the parts you need to construct. This is generally how it is done in the real world, where we would almost never hard-code a data structure like this.
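
As one possible way to put the pieces together, the sketch below wraps the construction in a function and appends a $sort stage so the highest weights come first. The `buildWeightPipeline` name and the final $sort stage are assumptions added for illustration, not part of the original answer.

```javascript
// Hypothetical helper: builds the aggregation pipeline for an arbitrary keyword list.
function buildWeightPipeline(keywords) {
  // One {$eq: ["$tags", keyword]} test per keyword, combined below with $or.
  const orCondition = keywords.map((keyword) => ({ $eq: ['$tags', keyword] }));
  return [
    { $unwind: '$tags' },
    {
      $group: {
        _id: '$title',
        weight: { $sum: { $cond: [{ $or: orCondition }, 1, 0] } },
      },
    },
    // Assumed final stage: documents with the most matching tags first.
    { $sort: { weight: -1 } },
  ];
}

const pipeline = buildWeightPipeline(['future', 'google', 'cat']);
console.log(JSON.stringify(pipeline, null, 2));
```

In the mongo shell, the result could then be passed along as `db.books.aggregate(pipeline)`. Building the stages as plain data like this keeps the keyword list out of the query text.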
