Text comparison algorithm

Question

We have a project requirement to compare two legal texts (update1, update2) and come up with an algorithm that determines how many words and how many sentences have changed.

Are there any algorithms I can use for this? I am not even looking for code. If I know the algorithm, I can code it in Java. Thank you.

Recommended answer

Typically this is accomplished by finding the Longest Common Subsequence (commonly called the LCS problem). This is how tools like diff work. Of course, diff is a line-oriented tool, and it sounds like your needs are somewhat different: you would run the same comparison over words or sentences instead of lines. However, I'm assuming that you've already constructed some way to compare words and sentences.
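
Since the question mentions Java, here is a minimal sketch of the LCS idea applied to words and sentences rather than lines. Everything in it is an assumption made for illustration: the class name TextDiffSketch, the helper names, splitting words on whitespace, and splitting sentences after ".", "!" or "?". Tokens outside the longest common subsequence are counted as removed (present only in update1) or added (present only in update2).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class; names and tokenization rules are illustrative assumptions.
class TextDiffSketch {

    // Classic O(n*m) dynamic-programming table for the LCS length of two token lists.
    static int lcsLength(List<String> a, List<String> b) {
        int[][] dp = new int[a.size() + 1][b.size() + 1];
        for (int i = 1; i <= a.size(); i++) {
            for (int j = 1; j <= b.size(); j++) {
                if (a.get(i - 1).equals(b.get(j - 1))) {
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                } else {
                    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
                }
            }
        }
        return dp[a.size()][b.size()];
    }

    // Returns {removed, added}: tokens only in the old text, tokens only in the new text.
    static int[] countChanges(List<String> oldTokens, List<String> newTokens) {
        int common = lcsLength(oldTokens, newTokens);
        return new int[] { oldTokens.size() - common, newTokens.size() - common };
    }

    // Assumption: a "word" is any run of non-whitespace characters (punctuation stays attached).
    static List<String> words(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    // Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    static List<String> sentences(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("(?<=[.!?])\\s+"));
    }
}
```

Calling TextDiffSketch.countChanges(TextDiffSketch.words(update1), TextDiffSketch.words(update2)) gives the word-level counts, and the same call with sentences(...) gives the sentence-level counts. The sentence splitter is deliberately naive and would probably mis-split legal abbreviations such as "Art. 5"; a locale-aware splitter like java.text.BreakIterator may be a better fit. The full table uses memory proportional to the product of the two token counts, which could matter for long documents.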
