Get documents with tags in list, ordered by total number of matches


Problem description

    Given the following MongoDB collection of documents:

    {
     title : 'shirt one',
     tags : [
      'shirt',
      'cotton',
      't-shirt',
      'black'
     ]
    },
    {
     title : 'shirt two',
     tags : [
      'shirt',
      'white',
      'button down collar'
     ]
    },
    {
     title : 'shirt three',
     tags : [
      'shirt',
      'cotton',
      'red'
     ]
    },
    ...
    

    How do you retrieve a list of items matching a list of tags, ordered by the total number of matched tags? For example, given this list of tags as input:

    ['shirt', 'cotton', 'black']
    

    I'd want to retrieve the items ranked in descending order by total number of matching tags:

    item          total matches
    --------      --------------
    Shirt One     3 (matched shirt + cotton + black)
    Shirt Three   2 (matched shirt + cotton)
    Shirt Two     1 (matched shirt)
    

    In a relational schema, tags would be a separate table, and you could join against that table, count the matches, and order by the count.

    But, in Mongo... ?

    It seems this approach could work:

    • break the input tags into multiple "IN" statements
    • query for items by "OR"'ing together the tag inputs
      • i.e. where ( 'shirt' IN items.tags ) OR ( 'cotton' IN items.tags )
      • this would return, for example, three instances of "Shirt One", 2 instances of "Shirt Three", etc
    • map/reduce that output
      • map: emit(this._id, {...});
      • reduce: count total occurrences of _id
      • finalize: sort by counted total
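
The steps listed above can be sketched in plain JavaScript, outside MongoDB, to make the intended counting concrete. This is only an in-memory illustration of the emit / reduce / sort idea, not a mapReduce call; the numeric `_id` values and the helper name `countTagMatches` are made up for the example.

```javascript
// Sample documents from the question; the numeric _id values are invented for illustration.
const items = [
  { _id: 1, title: 'shirt one', tags: ['shirt', 'cotton', 't-shirt', 'black'] },
  { _id: 2, title: 'shirt two', tags: ['shirt', 'white', 'button down collar'] },
  { _id: 3, title: 'shirt three', tags: ['shirt', 'cotton', 'red'] },
];

// countTagMatches is a hypothetical helper name, not a MongoDB API.
function countTagMatches(docs, wantedTags) {
  // "OR" query + "map" step: emit one [_id, 1] pair per (document, wanted tag) hit.
  const emitted = [];
  for (const doc of docs) {
    for (const tag of wantedTags) {
      if (doc.tags.includes(tag)) emitted.push([doc._id, 1]);
    }
  }

  // "reduce" step: total the emitted values per _id.
  const totals = new Map();
  for (const [id, value] of emitted) {
    totals.set(id, (totals.get(id) || 0) + value);
  }

  // Final step: sort by counted total, highest first.
  return Array.from(totals, ([_id, matches]) => ({ _id, matches }))
    .sort((a, b) => b.matches - a.matches);
}

const counted = countTagMatches(items, ['shirt', 'cotton', 'black']);
console.log(counted);
```

With the three sample documents, "shirt one" is emitted three times, "shirt three" twice and "shirt two" once, which is the ranking asked for in the question.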

    But I'm not clear on how to implement this as a Mongo query, or if this is even the most efficient approach.

    Solution

    As I answered in "In MongoDB search in an array and sort by number of matches", this is possible using the Aggregation Framework.

    Assumptions

    • The tags attribute is a set (no repeated elements); otherwise a repeated tag would be counted once per occurrence after $unwind

    Query

    This approach forces you to unwind the tags array and re-evaluate the match predicate against the unwound results, so it is quite inefficient.

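    db.test_col.aggregate([
        // Keep only documents that contain at least one of the requested tags.
        {$match: {tags: {$in: ["shirt","cotton","black"]}}},
        // Emit one document per tag.
        {$unwind: "$tags"},
        // Drop the unwound tags that were not requested.
        {$match: {tags: {$in: ["shirt","cotton","black"]}}},
        // Count the remaining (matching) tags per original document.
        {$group: {
            _id: {"_id": "$_id"},
            matches: {$sum: 1}
        }},
        // Most matches first.
        {$sort: {matches: -1}}
    ]);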
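
Since the tag list appears twice in the query above, it may be convenient to build the pipeline from a single list. The following is a minimal sketch; `buildUnwindPipeline` is a hypothetical helper name, and the collection name `test_col` is taken from the answer.

```javascript
// buildUnwindPipeline is a hypothetical helper name, not a MongoDB API.
// It returns the same five stages as the query above for any list of tags.
function buildUnwindPipeline(wantedTags) {
  const tagFilter = { tags: { $in: wantedTags } };
  return [
    { $match: tagFilter },      // documents with at least one wanted tag
    { $unwind: '$tags' },       // one document per tag
    { $match: tagFilter },      // keep only the wanted tags
    { $group: { _id: { _id: '$_id' }, matches: { $sum: 1 } } }, // count per document
    { $sort: { matches: -1 } }, // most matches first
  ];
}

const unwindPipeline = buildUnwindPipeline(['shirt', 'cotton', 'black']);
// In the mongo shell this would be passed as: db.test_col.aggregate(unwindPipeline)
console.log(JSON.stringify(unwindPipeline, null, 2));
```

Building the stages in one place keeps the two `$match` filters from drifting apart when the tag list changes.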
    

    Expected Results

    {
        "result" : [
            {
                "_id" : {
                    "_id" : ObjectId("5051f1786a64bd2c54918b26")
                },
                "matches" : 3
            },
            {
                "_id" : {
                    "_id" : ObjectId("5051f1726a64bd2c54918b24")
                },
                "matches" : 2
            },
            {
                "_id" : {
                    "_id" : ObjectId("5051f1756a64bd2c54918b25")
                },
                "matches" : 1
            }
        ],
        "ok" : 1
    }
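
As a possible alternative that avoids the `$unwind` step, the match count can be computed directly as the size of the intersection between each document's tags and the requested tags. This is a different technique from the answer's query, sketched here on the assumption that the server provides the `$setIntersection` and `$size` operators (MongoDB 2.6 or later, to my knowledge); `buildIntersectionPipeline` is a hypothetical helper name.

```javascript
// buildIntersectionPipeline is a hypothetical helper name, not a MongoDB API.
// Assumes a server that supports $setIntersection and $size.
function buildIntersectionPipeline(wantedTags) {
  return [
    // Keep only documents with at least one wanted tag.
    { $match: { tags: { $in: wantedTags } } },
    // Count the tags shared between the document and the wanted list.
    { $project: {
        title: 1,
        matches: { $size: { $setIntersection: ['$tags', wantedTags] } },
    } },
    // Most matches first.
    { $sort: { matches: -1 } },
  ];
}

const intersectionPipeline = buildIntersectionPipeline(['shirt', 'cotton', 'black']);
// In the mongo shell this would be passed as: db.test_col.aggregate(intersectionPipeline)
console.log(JSON.stringify(intersectionPipeline, null, 2));
```

Because `$setIntersection` treats its inputs as sets, repeated tags in a document are counted once here, and the output documents carry `title` and `matches` at the top level rather than the nested `_id` shape shown above.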
    
