Character recognition (OCR algorithm)


Question

I am working on a project in which I have to develop an OCR algorithm: I have to read the text from an image and then translate it into a different language. My first task is therefore to get the text out of the image.

Steps to complete the first task:

  1. Load an image in any format (bmp, jpg, png) from a given source, convert it to grayscale, and binarize it using a threshold (Otsu's algorithm). (Done. How do I remove noise from the output image?)

  2. Detect image features such as resolution and inversion, so that the image can finally be converted to a straightened image for further processing. (The rotation code is complete, but I cannot yet detect the angle by which the image has to be rotated, so I am still working on the angle detection part.)

  3. Line detection and removal. This step is required to improve page layout analysis, to achieve better recognition quality for underlined text, to detect tables, etc. (I have decided to complete this part at the end.)

  4. Page layout analysis. In this step I am trying to identify the text zones present in the image, so that only those zones are used for recognition and the rest of the image is left out.

  5. Detection of text lines and words. Here we also need to take care of different font sizes and small spaces between words.

  6. Recognition of characters. This is the main algorithm of OCR; the image of every character must be converted to the appropriate character code. Sometimes this algorithm produces several character codes for uncertain images. For instance, recognizing the image of an "I" character can produce the codes "I", "|", "1", and "l"; the final character code is selected later.

  7. Saving the results to the selected output format, for instance searchable PDF, DOC, RTF, or TXT. It is important to preserve the original page layout: columns, fonts, colors, pictures, background, and so on.
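As an illustration of the binarization in step 1, here is a minimal pure-Python sketch of Otsu's method. It assumes the grayscale image is already available as a list of rows of integers in 0..255; the function names `otsu_threshold` and `binarize` and the sample image are hypothetical, not part of any library.

```python
def otsu_threshold(gray):
    """Return the threshold in 0..255 that maximizes between-class variance.

    `gray` is a list of rows of integer intensities (0 = black, 255 = white).
    """
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1

    total = sum(hist)
    sum_all = sum(i * count for i, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark = 0    # number of pixels with intensity <= t
    sum_dark = 0  # summed intensity of those pixels
    for t in range(256):
        w_dark += hist[t]
        sum_dark += t * hist[t]
        w_light = total - w_dark
        if w_dark == 0:
            continue          # no pixels in the dark class yet
        if w_light == 0:
            break             # every pixel is already in the dark class
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        var_between = w_dark * w_light * (mean_dark - mean_light) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def binarize(gray, threshold):
    """Map pixels at or below the threshold to 1 (ink) and the rest to 0."""
    return [[1 if value <= threshold else 0 for value in row] for row in gray]


# Tiny hypothetical image: dark text pixels near 10, light paper near 200.
sample = [[10, 12, 200],
          [11, 210, 205]]
t = otsu_threshold(sample)
binary = binarize(sample, t)   # [[1, 1, 0], [1, 0, 0]]
```

The sketch treats dark pixels as ink; for inverted images (light text on a dark background) the comparison in `binarize` would have to be flipped.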

So I need help with part 6. I have completed the line detection part (getting n images from a paragraph containing n lines), but I am stuck on the next part: extracting words and recognizing characters. If you know of good links related to OCR and character recognition, please post them here.

For character recognition I am thinking of using Asprise (a Java library): http://asprise.com/product/ocr/index.php?lang=java

Recommended answer

For noise reduction, replace any pixel that has no neighbour (north, east, south or west) of the same color (or a similar color, within a tolerance threshold) with the average of its neighbours.

Search for vertical white gaps for layout detection, and slice along each vertical gap. For each slice, search for horizontal gaps and slice again. If the resulting slices have the same (or a similar) height, you are at line level. Otherwise repeat the vertical/horizontal slicing until you only have lines left. The last step is again a vertical slicing, which gives you the single characters (or ligatures in some cases). Long and narrow or short and wide slices are lines.
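The slicing can be sketched with projection profiles: count the ink pixels per row or column and cut wherever the count drops to zero. The version below is a simplified, non-recursive variant for a single-column page (rows first, then columns within each row band); it assumes a binary image with 1 for ink and 0 for background, and all function names are hypothetical.

```python
def find_runs(profile):
    """Return (start, end) pairs, end exclusive, of consecutive non-zero entries."""
    runs, start = [], None
    for i, count in enumerate(profile):
        if count and start is None:
            start = i                      # a run of ink begins
        elif not count and start is not None:
            runs.append((start, i))        # a white gap ends the run
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def split_rows(binary):
    """Cut along horizontal white gaps, giving one band per text line."""
    return find_runs([sum(row) for row in binary])


def split_columns(binary, top, bottom):
    """Cut one row band along vertical white gaps, giving character slices."""
    width = len(binary[0])
    profile = [sum(binary[y][x] for y in range(top, bottom)) for x in range(width)]
    return find_runs(profile)


def segment_characters(binary):
    """Return (top, bottom, left, right) boxes, end indices exclusive."""
    boxes = []
    for top, bottom in split_rows(binary):
        for left, right in split_columns(binary, top, bottom):
            boxes.append((top, bottom, left, right))
    return boxes


# Hypothetical page: two text lines separated by one blank row.
page = [[1, 0, 1, 1, 0],
        [1, 0, 1, 1, 0],
        [0, 0, 0, 0, 0],
        [0, 1, 0, 0, 1]]
boxes = segment_characters(page)
```

A fuller implementation would recurse as the answer describes, require a minimum gap width so that noise does not split characters, and distinguish word gaps from character gaps by their width.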

Compare the character slices with a character library. If performance is not the main concern, try to find the characters within different font libraries until you can identify the font used. Then stick with that font for character recognition.
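One simple way to do this comparison is pixel-wise template matching. The sketch below assumes every character slice has already been scaled to the same grid size as the templates; the 3x3 `library` is a toy stand-in for a real font library, and the function names are hypothetical.

```python
def similarity(glyph, template):
    """Fraction of pixels that agree between two equally sized binary grids."""
    total = sum(len(row) for row in glyph)
    same = sum(
        g == t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )
    return same / total


def rank_candidates(glyph, library):
    """Return (character, score) pairs, best match first."""
    scored = [(ch, similarity(glyph, template)) for ch, template in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3x3 "font"; a real library would hold larger templates per font.
library = {
    "I": [[1, 1, 1], [0, 1, 0], [1, 1, 1]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
    "-": [[0, 0, 0], [1, 1, 1], [0, 0, 0]],
}

# An "L" with one pixel missing in the bottom-right corner.
noisy = [[1, 0, 0], [1, 0, 0], [1, 1, 0]]
ranked = rank_candidates(noisy, library)
best_char, best_score = ranked[0]
```

Returning the whole ranked list rather than only the best match leaves room for a later stage to choose among near-ties such as "I", "|", "1" and "l".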

In the original image, replace each character with the background color, which is determined for each pixel of the character by interpolating the pixels that are not part of the character. This gives you the background image, if any.
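A rough sketch of that background reconstruction follows. It assumes a grayscale image plus a same-sized mask with 1 for character pixels, and it uses a deliberately crude interpolation: the unweighted average of the nearest non-character pixel in each of the four directions. The function name `fill_background` is hypothetical.

```python
def fill_background(gray, mask):
    """Replace masked (character) pixels with an estimate of the background."""
    height, width = len(gray), len(gray[0])
    out = [row[:] for row in gray]
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue                       # background pixel: keep as is
            samples = []
            for dy, dx in ((-1, 0), (0, 1), (1, 0), (0, -1)):
                ny, nx = y + dy, x + dx
                # Walk in this direction until we leave the character.
                while 0 <= ny < height and 0 <= nx < width and mask[ny][nx]:
                    ny, nx = ny + dy, nx + dx
                if 0 <= ny < height and 0 <= nx < width:
                    samples.append(gray[ny][nx])
            if samples:
                out[y][x] = sum(samples) // len(samples)
    return out


# A two-pixel dark stroke on a background that brightens from left to right.
row_image = [[100, 0, 0, 200]]
row_mask = [[0, 1, 1, 0]]
filled = fill_background(row_image, row_mask)
```

Weighting each sample by the inverse of its distance would give a smoother gradient across wide characters; the unweighted average is used here only to keep the sketch short.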
