Text difference algorithm

Views: 375 Published: 2016/8/26 20:19:35 Tags: c# python diff


Question

I need an algorithm that can compare two text files and highlight their differences and (even better!) compute the difference in a meaningful way: two similar files should get a higher similarity score than two dissimilar files, with "similar" defined in the ordinary sense. It sounds easy to implement, but it's not.

The implementation can be in C# or Python.

Thanks.

Recommended answer

In Python, there is difflib, as others have also suggested.

difflib offers the SequenceMatcher class, which can give you a similarity ratio. Example function:

import difflib

def text_compare(text1, text2, isjunk=None):
    return difflib.SequenceMatcher(isjunk, text1, text2).ratio()  # score in [0, 1]
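
The function above covers the similarity score; the question also asks for the differences to be highlighted. A minimal sketch of both, using only the standard difflib module, might look like this. The helper names similarity and highlight_differences and the sample strings are made up for illustration, and passing autojunk=False is an assumption about what is wanted (it turns off difflib's heuristic that treats very frequent elements in long sequences as junk).

```python
import difflib

def similarity(text1, text2):
    """Return a similarity score in [0, 1] for two strings (hypothetical helper)."""
    # autojunk=False disables the "popular element" heuristic for long inputs.
    matcher = difflib.SequenceMatcher(None, text1, text2, autojunk=False)
    return matcher.ratio()

def highlight_differences(text1, text2):
    """Return unified-diff lines: '-' marks lines only in text1, '+' only in text2."""
    return list(difflib.unified_diff(
        text1.splitlines(),
        text2.splitlines(),
        fromfile="old",
        tofile="new",
        lineterm="",  # input lines carry no newline, so headers should not either
    ))

# Made-up sample texts, standing in for the contents of two files.
old = "the quick brown fox\njumps over the lazy dog\n"
new = "the quick brown fox\njumps over the lazy cat\n"

print(similarity(old, new))
print("\n".join(highlight_differences(old, new)))
```

SequenceMatcher compares any two sequences, so passing lists of lines (or of words) instead of raw strings gives a score based on whole lines or words rather than characters, which may match the everyday sense of "similar" better for some documents.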
