How to build a 'related questions' engine?


Question

One of our bigger sites has a section where users can send questions to the website owner, which are then evaluated personally by his staff. When the same question comes up very often, they can add it to the FAQ.

To keep them from receiving dozens of similar questions a day, we would like to provide a feature similar to the 'Related questions' on this site (Stack Overflow).

What ways are there to build this kind of feature? I know that I should somehow evaluate the incoming question and compare it to the questions in the FAQ, but how does this comparison work? Are keywords extracted, and if so, how?

It might be worth mentioning that this site is built on the LAMP stack, so those are the technologies available.

Thanks!

Recommended answer

I don't know how Stack Overflow works, but I guess that it uses the tags to find related questions. For example, on this question the top few related questions all have the tag recommendation-engine. I would guess that matches on rarer tags count for more than matches on common tags.
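To make the "rarer tags count for more" idea concrete, here is a minimal sketch. It is only a guess at one way to do it, not how Stack Overflow actually works. It is written in Python for brevity, and the same logic should port to PHP on a LAMP stack. The function names, the log-based weight, and the sample FAQ data are all assumptions made up for illustration.

```python
import math
from collections import Counter


def tag_weights(tagged_questions):
    """Weight each tag by log(N / number of questions carrying it).

    A tag present on every question gets weight 0; rare tags get more.
    """
    n = len(tagged_questions)
    counts = Counter(
        tag for tags in tagged_questions.values() for tag in set(tags)
    )
    return {tag: math.log(n / count) for tag, count in counts.items()}


def related_by_tags(query_tags, tagged_questions, top_n=3):
    """Return ids of the questions sharing the most heavily weighted tags."""
    weights = tag_weights(tagged_questions)
    scored = []
    for question_id, tags in tagged_questions.items():
        shared = set(query_tags) & set(tags)
        score = sum(weights[tag] for tag in shared)
        if score > 0:
            scored.append((score, question_id))
    # Highest score first; ties broken by id so the order is stable.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [question_id for _, question_id in scored[:top_n]]


# Hypothetical FAQ entries with hand-assigned tags.
faq = {
    "faq-1": ["php", "recommendation-engine"],
    "faq-2": ["php", "mysql"],
    "faq-3": ["php", "apache"],
    "faq-4": ["php", "recommendation-engine", "mysql"],
}

print(related_by_tags(["php", "recommendation-engine"], faq))
```

In this sample every entry carries the php tag, so it contributes nothing to the score, and only the entries that share the rarer recommendation-engine tag are returned.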

You might also want to look at term frequency-inverse document frequency (TF-IDF).
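If the FAQ entries have no tags, TF-IDF can be applied to the question text itself. The sketch below is one possible way to do it, under some assumptions: a naive lowercase word tokenizer, a smoothed IDF, and cosine similarity for the comparison. It is in plain Python with no libraries, so it should translate to PHP fairly directly. The function names and the sample questions are invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Very naive tokenizer: lowercase alphanumeric runs."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_idf(documents):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(documents)
    doc_freq = Counter(
        term for doc in documents for term in set(tokenize(doc))
    )
    return {
        term: math.log((1 + n) / (1 + count)) + 1
        for term, count in doc_freq.items()
    }


def tfidf_vector(text, idf):
    """Sparse TF-IDF vector; terms unknown to the corpus are ignored."""
    term_freq = Counter(tokenize(text))
    return {t: c * idf[t] for t, c in term_freq.items() if t in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def related_questions(query, faq_questions, top_n=3):
    """Rank FAQ questions by TF-IDF cosine similarity to the query."""
    idf = build_idf(faq_questions)
    query_vec = tfidf_vector(query, idf)
    scored = [
        (cosine(query_vec, tfidf_vector(question, idf)), question)
        for question in faq_questions
    ]
    scored = [item for item in scored if item[0] > 0]
    scored.sort(key=lambda item: -item[0])
    return [question for _, question in scored[:top_n]]


# Hypothetical FAQ questions.
faq_questions = [
    "How do I reset my password?",
    "How do I change my email address?",
    "Where can I download my invoice?",
]

result = related_questions(
    "I forgot my password, how can I reset it?", faq_questions
)
print(result[0])
```

Very common words such as "how" or "my" still match here, they just weigh less. In practice a stop-word list and stemming would probably improve the results, and on MySQL a FULLTEXT index may be a simpler starting point.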
