Get documents with tags in list, ordered by total number of matches
Problem description
Given the following MongoDB collection of documents:
{
  title : 'shirt one',
  tags : [
    'shirt',
    'cotton',
    't-shirt',
    'black'
  ]
},
{
  title : 'shirt two',
  tags : [
    'shirt',
    'white',
    'button down collar'
  ]
},
{
  title : 'shirt three',
  tags : [
    'shirt',
    'cotton',
    'red'
  ]
},
...
How do you retrieve a list of items matching a list of tags, ordered by the total number of matched tags? For example, given this list of tags as input:
['shirt', 'cotton', 'black']
I'd want to retrieve the items ranked in descending order by total number of matching tags:
item          total matches
-----------   -------------
Shirt One     3 (matched shirt + cotton + black)
Shirt Three   2 (matched shirt + cotton)
Shirt Two     1 (matched shirt)
In a relational schema, tags would be a separate table, and you could join against that table, count the matches, and order by the count.
But in Mongo...?
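It seems like this approach could work:
- break the input tags into multiple "IN" statements
- query for items by "OR"-ing together the tag inputs
- i.e. where ('shirt' IN items.tags) OR ('cotton' IN items.tags)
- this would return, for example, three instances of "Shirt One", two instances of "Shirt Three", etc.
- map: emit(this._id, {...});
- reduce: count the total occurrences of _id
- finalize: sort by the counted total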
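The steps listed above can be sketched in plain JavaScript, outside MongoDB, to show what the map, reduce, and finalize phases would compute. This is only an in-memory illustration: the `items` array, its numeric `_id` values, and the `rankByMatchedTags` helper are hypothetical names introduced here, not part of any MongoDB API.

```javascript
// Hypothetical in-memory stand-in for the collection (the _id values are made up).
const items = [
  { _id: 1, title: 'shirt one', tags: ['shirt', 'cotton', 't-shirt', 'black'] },
  { _id: 2, title: 'shirt two', tags: ['shirt', 'white', 'button down collar'] },
  { _id: 3, title: 'shirt three', tags: ['shirt', 'cotton', 'red'] },
];

// Hypothetical helper that mirrors the map / reduce / finalize steps.
function rankByMatchedTags(docs, inputTags) {
  const counts = new Map();
  for (const doc of docs) {
    for (const tag of inputTags) {
      // "map": emit the document id once for every input tag found in its tags
      if (doc.tags.includes(tag)) {
        // "reduce": count the total occurrences of each emitted id
        counts.set(doc._id, (counts.get(doc._id) || 0) + 1);
      }
    }
  }
  // "finalize": sort by the counted total, highest first
  return [...counts.entries()]
    .map(([_id, matches]) => ({ _id, matches }))
    .sort((a, b) => b.matches - a.matches);
}

const ranked = rankByMatchedTags(items, ['shirt', 'cotton', 'black']);
console.log(ranked);
```

With the three sample documents, "shirt one" comes first with three matches, then "shirt three" with two, then "shirt two" with one, which is the ranking asked for.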
But I'm not clear on how to implement this as a Mongo query, or if this is even the most efficient approach.
Recommended answer
This can be done with the aggregation framework.
Assumption
The tags attribute is a set (no repeated elements).
Query
This approach forces you to unwind the tags arrays and re-evaluate the match predicate against the unwound results, so it's really inefficient.
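db.test_col.aggregate([
  // keep documents that have at least one of the input tags
  {$match: {tags: {$in: ["shirt", "cotton", "black"]}}},
  // produce one document per tag
  {$unwind: "$tags"},
  // keep only the tags that are in the input list
  {$match: {tags: {$in: ["shirt", "cotton", "black"]}}},
  // count the remaining tags per original document
  {$group: {_id: {"_id": "$_id"}, matches: {$sum: 1}}},
  {$sort: {matches: -1}}
]);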
Expected result
{
  "result" : [
    { "_id" : { "_id" : ObjectId("5051f1786a64bd2c54918b26") }, "matches" : 3 },
    { "_id" : { "_id" : ObjectId("5051f1726a64bd2c54918b24") }, "matches" : 2 },
    { "_id" : { "_id" : ObjectId("5051f1756a64bd2c54918b25") }, "matches" : 1 }
  ],
  "ok" : 1
}
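As a possible alternative to the unwind-and-rematch cost mentioned in the answer, the sketch below builds a pipeline that counts the shared tags directly. It is not the answer's method. It assumes a MongoDB version that provides the `$setIntersection`, `$size`, and `$literal` operators (2.6 or later), and that `tags` is always an array. The `buildTagMatchPipeline` helper is a hypothetical name introduced here, and the output documents have a different shape from the result shown above.

```javascript
// Hypothetical helper: builds a pipeline that counts matching tags per document
// without $unwind. Assumes $setIntersection, $size and $literal are available.
function buildTagMatchPipeline(inputTags) {
  return [
    // Keep only documents that share at least one tag with the input list
    { $match: { tags: { $in: inputTags } } },
    // Count the tags each document has in common with the input list;
    // $literal keeps the input strings from being parsed as field paths
    {
      $project: {
        title: 1,
        matches: { $size: { $setIntersection: ['$tags', { $literal: inputTags }] } },
      },
    },
    // Highest number of matches first
    { $sort: { matches: -1 } },
  ];
}

const pipeline = buildTagMatchPipeline(['shirt', 'cotton', 'black']);
console.log(JSON.stringify(pipeline, null, 2));
// In the mongo shell the pipeline could then be passed to the collection used above:
//   db.test_col.aggregate(pipeline)
```

Because `$setIntersection` treats its inputs as sets, this variant relies on the same assumption as the answer: a document's tags contain no repeated elements that should be counted twice.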