Image preprocessing for text recognition


Question

What's the best set of image preprocessing operations to apply to images for text recognition in EmguCV?

I've included two sample images here.

Applying a low or high pass filter won't be suitable, as the text may be of any size. I've tried median and bilateral filters, but they don't seem to affect the image much.

The ideal result would be a binary image with all the text white, and most of the rest black. This image would then be sent to the OCR engine.

Thanks.

Recommended answer

There is no single best set. Keep in mind that digital images can be acquired by different capture devices, and each device can embed its own preprocessing (filters) and other characteristics that can drastically change the image and even add noise to it. So every case has to be treated (preprocessed) differently.

However, there are common operations that can improve detection. A very basic one is to convert the image to grayscale and apply a threshold to binarize it. Another technique I've used before is the bounding box, which lets you detect the text region. To remove noise from images, you might be interested in erode/dilate operations. I demonstrate some of these operations in this post.
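
To make the grayscale, threshold, and erode/dilate steps concrete, here is a minimal dependency-free sketch in plain C++. It is not OpenCV's or EmguCV's implementation: the `GrayImage` type and the `toGray`, `binarize`, `erode`, and `dilate` helpers are hypothetical names introduced for illustration, and the fixed 3x3 neighborhood and luma weights are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal image type: 8-bit, single channel, row-major.
struct GrayImage {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> pixels;  // width * height values

    GrayImage(int w, int h, std::uint8_t fill = 0)
        : width(w), height(h), pixels(static_cast<std::size_t>(w) * h, fill) {}

    std::uint8_t at(int x, int y) const {
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
    std::uint8_t& at(int x, int y) {
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
};

// Convert interleaved 8-bit BGR data to grayscale with the common
// 0.299 R + 0.587 G + 0.114 B weights (integer arithmetic, rounded).
GrayImage toGray(const std::vector<std::uint8_t>& bgr, int width, int height) {
    assert(bgr.size() == static_cast<std::size_t>(width) * height * 3);
    GrayImage out(width, height);
    for (std::size_t i = 0; i < out.pixels.size(); ++i) {
        const int b = bgr[3 * i], g = bgr[3 * i + 1], r = bgr[3 * i + 2];
        out.pixels[i] =
            static_cast<std::uint8_t>((299 * r + 587 * g + 114 * b + 500) / 1000);
    }
    return out;
}

// Binarize: pixels strictly above thresh become maxValue, all others 0.
GrayImage binarize(const GrayImage& src, std::uint8_t thresh,
                   std::uint8_t maxValue = 255) {
    GrayImage out(src.width, src.height);
    for (std::size_t i = 0; i < src.pixels.size(); ++i) {
        out.pixels[i] = src.pixels[i] > thresh ? maxValue : 0;
    }
    return out;
}

// Shared 3x3 pass: erosion keeps the neighborhood minimum, dilation the
// maximum. Neighbors that fall outside the image are ignored.
GrayImage morph3x3(const GrayImage& src, bool takeMax) {
    GrayImage out(src.width, src.height);
    for (int y = 0; y < src.height; ++y) {
        for (int x = 0; x < src.width; ++x) {
            std::uint8_t v = src.at(x, y);
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    const int nx = x + dx, ny = y + dy;
                    if (nx < 0 || ny < 0 || nx >= src.width || ny >= src.height) {
                        continue;
                    }
                    v = takeMax ? std::max(v, src.at(nx, ny))
                                : std::min(v, src.at(nx, ny));
                }
            }
            out.at(x, y) = v;
        }
    }
    return out;
}

GrayImage erode(const GrayImage& src) { return morph3x3(src, false); }
GrayImage dilate(const GrayImage& src) { return morph3x3(src, true); }
```

Calling `dilate(erode(img))` on a binarized image removes white specks smaller than the 3x3 neighborhood while roughly preserving larger shapes. Whether that helps depends on the stroke width of the text: thin strokes can be eroded away along with the noise.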

Also, there are other interesting posts about OCR and OpenCV that you should take a look at:


  • Simple Digit Recognition OCR in OpenCV-Python (https://stackoverflow.com/a/9620295/176769)
  • Basic OCR in OpenCV

Now, just to show you a simple approach that can be used with your sample image, this is the result of inverting the color and applying a threshold:

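#include <opencv2/opencv.hpp>

int main(int argc, char** argv) {
    // Load the image given on the command line and invert its colors.
    cv::Mat new_img = cv::imread(argv[1]);
    if (new_img.empty()) return 1;
    cv::bitwise_not(new_img, new_img);

    // Values above thres become color (255); everything else becomes 0.
    double thres = 100;
    double color = 255;
    cv::threshold(new_img, new_img, thres, color, cv::THRESH_BINARY);

    cv::imwrite("inv_thres.png", new_img);
    return 0;
}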
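
Once an image has been binarized this way, the bounding box idea mentioned earlier can be sketched without any library. The version below is an assumption about what "bounding box" means here: it finds the smallest axis-aligned rectangle that contains every nonzero pixel. The `Box` type and `foregroundBox` function are hypothetical names, and the input is assumed to be a row-major, single-channel buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type: top-left corner plus size, in pixels.
struct Box {
    int x;
    int y;
    int width;
    int height;
};

// Smallest axis-aligned rectangle containing every nonzero pixel of a
// row-major, single-channel binary image. An image with no foreground
// yields a box whose width and height are both 0.
Box foregroundBox(const std::vector<std::uint8_t>& binary, int width, int height) {
    assert(binary.size() == static_cast<std::size_t>(width) * height);
    int minX = width, minY = height, maxX = -1, maxY = -1;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            if (binary[static_cast<std::size_t>(y) * width + x] == 0) continue;
            minX = std::min(minX, x);
            minY = std::min(minY, y);
            maxX = std::max(maxX, x);
            maxY = std::max(maxY, y);
        }
    }
    if (maxX < 0) return Box{0, 0, 0, 0};  // no foreground pixels found
    return Box{minX, minY, maxX - minX + 1, maxY - minY + 1};
}
```

A single stray white pixel far from the text stretches the box, so this kind of search is usually more useful after noise has been removed.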
