OCR: weighted Levenshtein distance


Question

I'm trying to build an optical character recognition system that uses a dictionary.

In fact, I haven't implemented the dictionary yet =)

I've heard that there are simple metrics based on Levenshtein distance that take into account different distances between different symbols. For example, 'N' and 'H' look very similar, so d("THEATRE", "TNEATRE") should be less than d("THEATRE", "TOEATRE"). That is impossible with the basic Levenshtein distance, which charges the same cost for every substitution.

Could you help me find such a metric, please?

Recommended answer

This might be what you are looking for: http://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance (the page helpfully includes some working code).

Update:

http://nlp.stanford.edu/IR-book/html/htmledition/edit-distance-1.html
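
For illustration, here is a minimal sketch of the kind of metric the question asks for: the usual Levenshtein dynamic program, but with a substitution cost that depends on the pair of characters. It is not taken from either link. The function names and the confusion costs in `CONFUSABLE` (0.3 for 'H'/'N', and so on) are made-up assumptions; in a real OCR system they would presumably be estimated from the recognizer's actual confusion statistics.

```python
def weighted_levenshtein(a, b, sub_cost, ins_cost=1.0, del_cost=1.0):
    """Edit distance where replacing x with y costs sub_cost(x, y)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = [j * ins_cost for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * del_cost]
        for j, cb in enumerate(b, start=1):
            substitution = 0.0 if ca == cb else sub_cost(ca, cb)
            curr.append(min(
                prev[j] + del_cost,          # delete ca
                curr[j - 1] + ins_cost,      # insert cb
                prev[j - 1] + substitution,  # match or substitute
            ))
        prev = curr
    return prev[-1]


# Hypothetical costs for visually similar pairs (assumed values, not measured).
CONFUSABLE = {
    frozenset("HN"): 0.3,
    frozenset("O0"): 0.2,
    frozenset("l1"): 0.2,
}


def ocr_sub_cost(x, y):
    """Cheap substitution for look-alike characters, full cost otherwise."""
    return CONFUSABLE.get(frozenset((x, y)), 1.0)


print(weighted_levenshtein("THEATRE", "TNEATRE", ocr_sub_cost))
print(weighted_levenshtein("THEATRE", "TOEATRE", ocr_sub_cost))
```

With these assumed costs, the 'H'/'N' substitution is charged less than the 'H'/'O' one, so "TNEATRE" ends up closer to "THEATRE" than "TOEATRE" does. Keeping every substitution cost at or below the sum of one deletion and one insertion ensures the cheap substitution is actually preferred over a delete-plus-insert path.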
