Get documents with tags in list, ordered by total number of matches
Problem description
Given the following MongoDB collection of documents:
{
    title: 'shirt one',
    tags: ['shirt', 'cotton', 't-shirt', 'black']
},
{
    title: 'shirt two',
    tags: ['shirt', 'white', 'button down collar']
},
{
    title: 'shirt three',
    tags: ['shirt', 'cotton', 'red']
},
...
How do you retrieve a list of items matching a list of tags, ordered by the total number of matched tags? For example, given this list of tags as input:
['shirt', 'cotton', 'black']
I want to retrieve the items ranked in descending order by total number of matching tags:
item           total matches
-----------    -------------
Shirt One      3 (matched shirt + cotton + black)
Shirt Three    2 (matched shirt + cotton)
Shirt Two      1 (matched shirt)
In a relational schema, tags would be a separate table, and you could join against that table, count the matches, and order by the count.
But, in Mongo... ?
Seems this approach could work,
- break the input tags into multiple "IN" statements
- query for items by "OR"'ing together the tag inputs
    - i.e. where ( 'shirt' IN items.tags ) OR ( 'cotton' IN items.tags )
    - this would return, for example, three instances of "Shirt One", two instances of "Shirt Three", etc.
- map/reduce that output
    - map: emit(this._id, {...});
    - reduce: count total occurrences of _id
    - finalize: sort by counted total
But I'm not clear on how to implement this as a Mongo query, or if this is even the most efficient approach.
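To make the desired ranking concrete, here is a minimal in-memory sketch in plain JavaScript. It is not a MongoDB query; it only illustrates the filter, count, and sort steps listed above. The helper name rankByTagMatches is hypothetical, and the sketch assumes each document's tags array contains no duplicates.

```javascript
// Hypothetical helper: rank documents by how many of the input tags they carry.
function rankByTagMatches(docs, inputTags) {
  const wanted = new Set(inputTags);
  return docs
    .map(doc => ({
      title: doc.title,
      // Count each matching tag once; assumes doc.tags has no duplicates.
      matches: doc.tags.filter(tag => wanted.has(tag)).length
    }))
    // Keep documents with at least one match (the OR of the IN tests).
    .filter(row => row.matches > 0)
    // Order by match count, highest first.
    .sort((a, b) => b.matches - a.matches);
}

const items = [
  { title: 'shirt one', tags: ['shirt', 'cotton', 't-shirt', 'black'] },
  { title: 'shirt two', tags: ['shirt', 'white', 'button down collar'] },
  { title: 'shirt three', tags: ['shirt', 'cotton', 'red'] }
];

const ranked = rankByTagMatches(items, ['shirt', 'cotton', 'black']);
console.log(ranked);
```

With the three sample documents, this ranks "shirt one" first with 3 matches, then "shirt three" with 2, then "shirt two" with 1, which is the ordering the query should reproduce on the server side.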
Solution
As I answered in "In MongoDB search in an array and sort by number of matches":
This is possible using the Aggregation Framework.
Assumptions
- The tags attribute is a set (no repeated elements)
Query
This approach forces you to unwind the tags array and re-evaluate the match predicate against the unwound results, so it's really inefficient.
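db.test_col.aggregate([
    // keep only documents that share at least one tag with the input
    {$match: {tags: {$in: ["shirt", "cotton", "black"]}}},
    // emit one document per tag
    {$unwind: "$tags"},
    // drop the unwound tags that are not in the input list
    {$match: {tags: {$in: ["shirt", "cotton", "black"]}}},
    // count the remaining tags per original document
    {$group: {_id: {_id: "$_id"}, matches: {$sum: 1}}},
    // most matches first
    {$sort: {matches: -1}}
]);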
Expected Results
{
    "result" : [
        {
            "_id" : { "_id" : ObjectId("5051f1786a64bd2c54918b26") },
            "matches" : 3
        },
        {
            "_id" : { "_id" : ObjectId("5051f1726a64bd2c54918b24") },
            "matches" : 2
        },
        {
            "_id" : { "_id" : ObjectId("5051f1756a64bd2c54918b25") },
            "matches" : 1
        }
    ],
    "ok" : 1
}
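As a possible alternative that is not part of the original answer: on MongoDB 2.6 or later, the $setIntersection and $size operators can compute the match count without $unwind. The sketch below only builds the pipeline as a plain JavaScript value; the function name buildTagMatchPipeline is hypothetical, and it relies on the same assumption that tags is a set. It also assumes no input tag starts with "$", since such a string would be read as a field path.

```javascript
// Hypothetical helper: build an aggregation pipeline that counts matching
// tags per document with $setIntersection instead of $unwind.
// Assumes MongoDB 2.6+ and that no input tag begins with "$".
function buildTagMatchPipeline(inputTags) {
  return [
    // Keep only documents that share at least one tag with the input.
    { $match: { tags: { $in: inputTags } } },
    // matches = number of elements common to the document's tags and the input.
    {
      $project: {
        title: 1,
        matches: { $size: { $setIntersection: ["$tags", inputTags] } }
      }
    },
    // Most matches first.
    { $sort: { matches: -1 } }
  ];
}

const pipeline = buildTagMatchPipeline(["shirt", "cotton", "black"]);
console.log(JSON.stringify(pipeline, null, 2));

// In the mongo shell this would be passed as:
//   db.test_col.aggregate(pipeline)
```

Because the count is computed per document in a single $project stage, the second $match and the $group stage are no longer needed; whether this is actually faster on a given data set would need to be measured.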