OCR and character similarity


Question

I am currently working on an OCR (Optical Character Recognition) system. I have already written a script that extracts each character from the text and cleans most of the irregularities out of it. I also know the font. For example, the images I have now are:

M (http://i.imgur.com/oRfSOsJ.png (font) and http://i.imgur.com/UDEJZyV.png (scanned))

K (http://i.imgur.com/PluXtDz.png (font) and http://i.imgur.com/TRuDXSx.png (scanned))

C (http://i.imgur.com/wggsX6M.png (font) and http://i.imgur.com/GF9vClh.png (scanned))

For all of these images I already have a binary matrix (1 for black, 0 for white). I am wondering whether there is some kind of mathematical, projection-like formula for measuring the similarity between these matrices. I do not want to rely on a library, because that is not the task I was given.

I know this question may seem a bit vague and that there are similar questions, but I am looking for the method, not for a package, and so far I could not find any comments about the method. The question is vague because I really have no starting point. What I want to do is actually described on Wikipedia:

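Matrix matching involves comparing an image to a stored glyph on a pixel-by-pixel basis; it is also known as "pattern matching" or "pattern recognition".[9] This relies on the input glyph being correctly isolated from the rest of the image, and on the stored glyph being in a similar font and at the same scale. This technique works best with typewritten text and does not work well when new fonts are encountered. (http://en.wikipedia.org/wiki/Optical_character_recognition#Character_recognition)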

If anyone could help me out with this, I would appreciate it very much.

Recommended answer

Most OCR systems use neural networks for recognition or classification

These must be properly configured for the desired task (number of layers, internal interconnection architecture, and so on). Another problem with neural networks is that they must be properly trained, which is hard to do well: you need to know things like the proper training dataset size, so that it contains enough information without over-training the network. If you have no experience with neural networks and need to implement everything yourself, do not go this way!

There are also other ways to compare patterns

  1. Vector approach

  • polygonize the image (edges or border)
  • compare the similarity of the polygons (surface area, perimeter, shape, ...)
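As a rough illustration of the polygon measures listed above, here is a minimal sketch. It assumes the glyph outline has already been polygonized into an ordered list of vertices; the `Point` type, the function names, and the `compactness` ratio are my own illustrative choices, not something the answer prescribes.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical vertex type for a polygonized glyph outline.
struct Point { double x, y; };

// Shoelace formula: absolute area of a simple (non self-intersecting) polygon.
double polygon_area(const std::vector<Point>& p) {
    double twice_area = 0.0;
    for (std::size_t i = 0; i < p.size(); ++i) {
        const Point& a = p[i];
        const Point& b = p[(i + 1) % p.size()];  // wrap around to close the polygon
        twice_area += a.x * b.y - b.x * a.y;
    }
    return std::fabs(twice_area) / 2.0;
}

// Sum of the edge lengths, including the closing edge.
double polygon_perimeter(const std::vector<Point>& p) {
    double length = 0.0;
    for (std::size_t i = 0; i < p.size(); ++i) {
        const Point& a = p[i];
        const Point& b = p[(i + 1) % p.size()];
        length += std::hypot(b.x - a.x, b.y - a.y);
    }
    return length;
}

// One possible scale-independent shape number (an assumption, not from the answer):
// 4*pi*area / perimeter^2, which is 1 for a circle and smaller for other shapes.
double compactness(const std::vector<Point>& p) {
    const double per = polygon_perimeter(p);
    if (per == 0.0) return 0.0;
    return 4.0 * 3.14159265358979323846 * polygon_area(p) / (per * per);
}
```

Two glyphs could then be compared by the differences of these numbers, ideally after normalizing for size.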

  2. Pixel approach

You can compare images based on:

  • histogram
  • DFT/DCT spectral analysis
  • size
  • number of occupied pixels per line
  • start position of the occupied pixels in each line (from the left)
  • end position of the occupied pixels in each line (from the right)
  • these 3 parameters can also be computed for rows
  • list of points of interest (points where something changes, such as an intensity bump, an edge, ...)

You create a feature list for each tested character and compare it to the feature lists of your font; the closest match is your character. These feature lists can also be scaled to some fixed size (like 64x64), so that the recognition becomes invariant to scaling.
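One way to bring a feature list to a fixed size could look like the sketch below. It assumes a feature is a 1D profile (for example the occupied pixel count of each line) and uses nearest-neighbour resampling; `resample_profile` and `value_scale` are hypothetical names, and other interpolation schemes would work as well.

```cpp
#include <cstddef>
#include <vector>

// Resample a 1D feature profile to exactly n samples (nearest neighbour)
// and multiply every value by value_scale so that magnitudes are rescaled too.
std::vector<double> resample_profile(const std::vector<int>& src,
                                     std::size_t n,
                                     double value_scale) {
    std::vector<double> dst(n, 0.0);
    if (src.empty()) return dst;  // nothing to sample from
    for (std::size_t i = 0; i < n; ++i) {
        // Map destination index i onto the source range [0, src.size()).
        const std::size_t j = i * src.size() / n;
        dst[i] = src[j] * value_scale;
    }
    return dst;
}
```

For a glyph that is `w` pixels wide and `h` pixels high, the per-line pixel counts have `h` entries with values up to `w`, so calling `resample_profile(counts, N, double(N) / w)` maps them onto an NxN grid.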

Here is what I use for OCR:

In this case the features are scaled to fit in NxN, so each character has 6 arrays of N numbers each, like:

int row_pixels[N]; // 1st image
int lin_pixels[N]; // 2nd image
int row_y0[N];     // 3rd image green
int row_y1[N];     // 3rd image red
int lin_x0[N];     // 4th image green
int lin_x1[N];     // 4th image red
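The images the comments refer to are not available here, so the following is only a sketch of how such arrays might be filled from a binary matrix. It assumes, based on the `x0`/`y0` suffixes, that the `lin_*` arrays are indexed by horizontal scan line and store x positions, while the `row_*` arrays are indexed by x and store y positions. The `Image` and `Features` types, the function name, and the use of -1 for empty lines are my own choices.

```cpp
#include <cstddef>
#include <vector>

// img[y][x]: 1 = black (occupied), 0 = white. Assumed to be rectangular.
using Image = std::vector<std::vector<int>>;

struct Features {
    std::vector<int> row_pixels, lin_pixels;  // occupied pixel counts
    std::vector<int> row_y0, row_y1;          // first/last occupied y for each x
    std::vector<int> lin_x0, lin_x1;          // first/last occupied x for each y
};

Features extract_features(const Image& img) {
    const std::size_t h = img.size();
    const std::size_t w = h ? img[0].size() : 0;
    Features f;
    f.lin_pixels.assign(h, 0);
    f.lin_x0.assign(h, -1);  // -1 marks a line without occupied pixels
    f.lin_x1.assign(h, -1);
    f.row_pixels.assign(w, 0);
    f.row_y0.assign(w, -1);
    f.row_y1.assign(w, -1);
    for (std::size_t y = 0; y < h; ++y) {
        for (std::size_t x = 0; x < w; ++x) {
            if (img[y][x] == 0) continue;
            ++f.lin_pixels[y];
            ++f.row_pixels[x];
            if (f.lin_x0[y] < 0) f.lin_x0[y] = static_cast<int>(x);
            f.lin_x1[y] = static_cast<int>(x);  // keeps the last occupied x
            if (f.row_y0[x] < 0) f.row_y0[x] = static_cast<int>(y);
            f.row_y1[x] = static_cast<int>(y);  // keeps the last occupied y
        }
    }
    return f;
}
```

This version works on the image at its native size; the resulting arrays would still need to be rescaled to N entries before glyphs of different sizes are compared.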

Now pre-compute all features for each character in your font and for each character you read. Then find the closest match from the font:

  • minimal distance between all feature vectors/arrays
  • not exceeding some threshold difference
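The matching step described above might be sketched as follows. The sketch assumes that the feature arrays of a glyph are concatenated into one vector and that the distance is a sum of absolute differences; the answer does not fix a particular metric, and `Glyph`, `feature_distance`, `classify`, and `max_distance` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdlib>
#include <limits>
#include <vector>

// A font glyph with its pre-computed features (all arrays concatenated).
struct Glyph {
    char label;
    std::vector<int> features;
};

// Sum of absolute differences between two feature vectors.
long feature_distance(const std::vector<int>& a, const std::vector<int>& b) {
    long d = 0;
    for (std::size_t i = 0; i < a.size() && i < b.size(); ++i)
        d += std::labs(static_cast<long>(a[i]) - b[i]);
    return d;
}

// Returns the label of the closest font glyph, or '\0' when even the best
// candidate is farther away than max_distance (the threshold).
char classify(const std::vector<int>& unknown,
              const std::vector<Glyph>& font,
              long max_distance) {
    char best = '\0';
    long best_d = std::numeric_limits<long>::max();
    for (const Glyph& g : font) {
        if (g.features.size() != unknown.size()) continue;  // not comparable
        const long d = feature_distance(unknown, g.features);
        if (d < best_d) {
            best_d = d;
            best = g.label;
        }
    }
    return best_d <= max_distance ? best : '\0';
}
```

A suitable threshold depends on N and on how noisy the scans are, so it would have to be tuned on real samples.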

This is partially invariant to rotation and skew, up to a point. I do OCR on filled characters, so for outlined fonts it may need some tweaking.

[Notes]

For the comparison you can use a distance or a correlation coefficient.
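As an illustration of the correlation option, here is a minimal sketch of the Pearson correlation coefficient between two equally long feature vectors. The function name and the choice to return 0 for empty, mismatched, or constant inputs are my own assumptions.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Pearson correlation coefficient in [-1, 1]. Returns 0 when the inputs are
// empty, differ in length, or when one of them has zero variance.
double correlation(const std::vector<double>& a, const std::vector<double>& b) {
    const std::size_t n = a.size();
    if (n == 0 || b.size() != n) return 0.0;

    double mean_a = 0.0, mean_b = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        mean_a += a[i];
        mean_b += b[i];
    }
    mean_a /= static_cast<double>(n);
    mean_b /= static_cast<double>(n);

    double cov = 0.0, var_a = 0.0, var_b = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double da = a[i] - mean_a;
        const double db = b[i] - mean_b;
        cov += da * db;
        var_a += da * da;
        var_b += db * db;
    }
    if (var_a == 0.0 || var_b == 0.0) return 0.0;
    return cov / std::sqrt(var_a * var_b);
}
```

With a correlation the best match is the font glyph with the highest value rather than the lowest, and the threshold becomes a minimum instead of a maximum. The same function can also be applied directly to two flattened binary matrices of equal size.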
