Stack Overflow related questions algorithm


Question


The related questions that appear after entering a title, and those in the right sidebar when viewing a question, seem to be very apt suggestions.

Spolsky said in a talk that Stack Overflow only does a SQL search for this and uses no special algorithms.

What algorithms exist to give good results in such a case? How do you do the database search? Do you make the title searchable and search on its keywords, or search on tags and put the questions with many votes on top?

Answer

The related questions sidebar is probably built on the tags for each question (likely by ranking candidates by tag overlap, so 5 tags in common > 4 tags in common, etc.).
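To make the tag-overlap idea concrete, here is a minimal sketch in Python. It is only a guess at the general approach, not Stack Overflow's actual implementation; the function name `rank_by_tag_overlap` and the sample data are made up for illustration.

```python
from typing import Dict, List, Set, Tuple


def rank_by_tag_overlap(
    target_tags: Set[str],
    candidates: Dict[int, Set[str]],
    limit: int = 5,
) -> List[Tuple[int, int]]:
    """Return (question_id, shared_tag_count) pairs, most shared tags first."""
    scored = [
        (qid, len(target_tags & tags))
        for qid, tags in candidates.items()
    ]
    # Drop questions with no tags in common, then sort by overlap (descending),
    # breaking ties by question id so the order is deterministic.
    scored = [pair for pair in scored if pair[1] > 0]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:limit]


# Hypothetical sample data: question id -> set of tags.
questions = {
    1: {"python", "sql", "search"},
    2: {"sql", "full-text-search", "search", "algorithm"},
    3: {"css", "html"},
    4: {"algorithm", "search"},
}

related = rank_by_tag_overlap({"sql", "search", "algorithm"}, questions)
print(related)
```

In a real database the same ranking could presumably be done with a join on a question-tag table, grouped by question and ordered by the count of shared tags, with votes as a possible tie-breaker.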

The rest is probably built on heuristics and algorithms suited to natural language processing. These normally aren't very good on general-purpose language, but most of them are VERY good once the vocabulary is reduced to a single technical area such as programming.
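One simple heuristic of this kind is keyword matching on titles, where rare words count for more than common ones. The sketch below is an assumption about what such a heuristic might look like, not a description of what Stack Overflow runs; the stop-word list, the weighting formula, and the names `tokenize` and `rank_similar_titles` are all hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical, deliberately tiny stop-word list.
STOP_WORDS = {"a", "an", "the", "to", "in", "of", "do", "i", "how", "is", "for", "on", "and"}


def tokenize(title: str) -> List[str]:
    """Lowercase the title and keep word-like tokens that are not stop words."""
    words = re.findall(r"[a-z0-9+#]+", title.lower())
    return [w for w in words if w not in STOP_WORDS]


def rank_similar_titles(
    query: str, titles: Dict[int, str], limit: int = 5
) -> List[Tuple[int, float]]:
    """Score each title by the summed weight of the keywords it shares with the query."""
    tokenized = {qid: set(tokenize(t)) for qid, t in titles.items()}
    # Document frequency: in how many titles each keyword appears.
    doc_freq = Counter(word for words in tokenized.values() for word in words)
    total = len(titles)
    query_words = set(tokenize(query))

    scored = []
    for qid, words in tokenized.items():
        shared = query_words & words
        # Rare keywords contribute more than common ones (an IDF-style weight).
        score = sum(math.log(1 + total / doc_freq[w]) for w in shared)
        if score > 0:
            scored.append((qid, score))
    # Highest score first; ties broken by question id.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:limit]


# Hypothetical sample titles: question id -> title.
titles = {
    1: "How to do full text search in SQL Server",
    2: "Algorithm for finding related questions",
    3: "How do I center a div in CSS",
}
results = rank_similar_titles("Stack Overflow related questions algorithm", titles)
print(results)
```

Restricting the vocabulary to one technical domain helps a scheme like this because terms such as library or language names are distinctive, so shared keywords are a stronger signal of relatedness than they would be in general text.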
