Algorithm to compare similarity of English sentences

Views: 163 | Published: 2015/11/30 14:31:53 | Tag: algorithm

I have a collection of sentences, and I need to analyse them to see how similar they are.

Are there any established algorithms to do this?

Things I'm considering:

I've used Levenshtein distance and n-grams for spelling before, although I'm not sure whether these translate to my purposes.
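For reference, the two techniques mentioned above can be sketched at the character level roughly as follows. This is only a minimal illustration: the function names (`levenshtein`, `char_ngrams`, `jaccard`) and the choice of trigrams with Jaccard overlap are assumptions made for the example, not something fixed by the question.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (strings or lists of tokens)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def char_ngrams(text, n=3):
    """Set of overlapping character n-grams (n=3 is an arbitrary choice)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(jaccard(char_ngrams("similar"), char_ngrams("similarity")))
```

Both measures work on raw characters, which is why they suit spelling correction; whether they say anything useful about sentence meaning is exactly the open question here.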

Naively, I don't care about spelling differences, and typos can be treated as different words, although it would perhaps be nice to account for them.

Perhaps some hybrid of splitting each sentence at spaces and applying one of the above (or other) algorithms would be a starting point.
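One way to picture that hybrid is to run the edit distance over words instead of characters. The sketch below is only an assumption about what such a hybrid could look like: `tokenize`, `word_edit_distance` and `sentence_similarity` are hypothetical names, and lowercasing, stripping punctuation and normalising by the longer sentence are choices made for the example.

```python
import string


def tokenize(sentence):
    """Split at whitespace, lowercase, and strip surrounding punctuation."""
    words = (w.strip(string.punctuation) for w in sentence.lower().split())
    return [w for w in words if w]


def word_edit_distance(a, b):
    """Levenshtein distance where the units are whole words, not characters."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1  # a typo counts as a different word
            curr.append(min(prev[j] + 1,          # drop a word
                            curr[j - 1] + 1,      # insert a word
                            prev[j - 1] + cost))  # replace or keep a word
        prev = curr
    return prev[-1]


def sentence_similarity(s1, s2):
    """Score in [0, 1]: 1.0 for identical token lists, 0.0 for no overlap in order."""
    t1, t2 = tokenize(s1), tokenize(s2)
    longest = max(len(t1), len(t2))
    if longest == 0:
        return 1.0
    return 1.0 - word_edit_distance(t1, t2) / longest


if __name__ == "__main__":
    print(sentence_similarity("The cat sat on the mat.", "The cat sat on a mat."))
```

Because words are compared for exact equality, a misspelling such as "teh" for "the" costs a full substitution, which matches the naive treatment of typos as different words. Swapping the equality test for a character-level distance threshold would be one way to tolerate them.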

What options are available? Any advice?

Thanks!