Algorithm that searches for related items based on common tags


Question

Let's take StackOverflow questions as an example. Each of them has multiple tags assigned. How can I build an algorithm that finds related questions based on how many tags they have in common, sorted by the number of common tags?

For now, I can't think of anything better than selecting all questions that have at least one common tag into an array, looping through them to assign the number of common tags to each item, and then sorting the array.
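The approach described above can be sketched in Python as follows. This is only an illustration of the idea: the function name `related_by_tags` and the in-memory layout (a dict mapping each question id to a set of tags) are assumptions, not part of the original question.

```python
def related_by_tags(question_tags, target_id):
    """Rank other questions by how many tags they share with target_id.

    question_tags is a hypothetical dict: question id -> set of tags.
    """
    target = question_tags[target_id]
    counts = {}
    for qid, tags in question_tags.items():
        if qid == target_id:
            continue
        common = len(target & tags)  # size of the tag intersection
        if common:  # keep only questions with at least one shared tag
            counts[qid] = common
    # Most shared tags first; ties broken by question id for stable output
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Made-up sample data for illustration
sample = {
    1: {"sql", "algorithm", "tags"},
    2: {"sql", "tags"},
    3: {"python"},
    4: {"algorithm", "sql", "tags", "mysql"},
    5: {"tags"},
}
print(related_by_tags(sample, 1))  # [(4, 3), (2, 2), (5, 1)]
```

This scans every question once per lookup, which is the cost the question hopes to avoid by pushing the work into the database.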

Is there a cleverer way of doing it? The perfect solution would be a single SQL query.

Recommended answer

This could be as bad as O(n^2), since it compares pairs of questions, but it works:

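create table QuestionTags (questionid int, tag int);

-- Pair up rows that share a tag. The q1.questionid < q2.questionid condition
-- lists each pair of questions once and never pairs a question with itself.
select q1.questionid, q2.questionid, count(*) as commontags
from QuestionTags q1
join QuestionTags q2
  on q1.tag = q2.tag and q1.questionid < q2.questionid
group by q1.questionid, q2.questionid
order by commontags desc;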
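The query above ranks every pair of questions. If only the questions related to one given question are needed, the same self-join can be restricted to that question. The following is a minimal sketch using Python's built-in `sqlite3` module. The function name `related_questions`, the in-memory database, and the sample rows are assumptions made for illustration, and it assumes each (questionid, tag) pair is stored only once.

```python
import sqlite3


def related_questions(conn, question_id):
    """Return (questionid, commontags) rows for questions sharing tags with question_id."""
    sql = """
        SELECT q2.questionid, COUNT(*) AS commontags
        FROM QuestionTags q1
        JOIN QuestionTags q2
          ON q1.tag = q2.tag AND q2.questionid <> q1.questionid
        WHERE q1.questionid = ?
        GROUP BY q2.questionid
        ORDER BY commontags DESC, q2.questionid
    """
    return conn.execute(sql, (question_id,)).fetchall()


# Made-up sample data in an in-memory database
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE QuestionTags (questionid INT, tag INT)")
conn.executemany(
    "INSERT INTO QuestionTags VALUES (?, ?)",
    [(1, 10), (1, 20), (1, 30),
     (2, 10), (2, 20),
     (3, 40),
     (4, 10), (4, 20), (4, 30), (4, 50),
     (5, 30)],
)
print(related_questions(conn, 1))  # [(4, 3), (2, 2), (5, 1)]
```

On a real table, indexes on `tag` and on `questionid` would likely help the join, though the actual benefit depends on the database engine and the data.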
