Remove Background Color or Texture Before OCR Processing

Question

When a typical mobile phone user takes a picture of a card-sized object, some background texture usually ends up in the image (please refer to the attached samples). In some cases that background can hurt OCR accuracy.

I am wondering whether there are ways to remove the background (I am positive there are), or to detect the background regions so that they can simply be cropped off before OCR. In the attached images, the wooden table and the counter-top are the candidates for removal. I imagine that color contrast could be part of a solution, but I am not sure.

Recommended answer

There are cases where you, as a human, have trouble telling background from foreground, so there is certainly no method that does exactly what you want every time. Since you mention OCR, I assume you actually want to eliminate everything that is not text. That does not make the question any easier, so my working assumption is that you want to keep objects that contrast strongly with their surroundings (foreground against background, or black text on a white background, for example). Again, there is no perfect method for that.

So all this answer does is present a simple method that might help with your task. The method combines standard morphological tools with Otsu's method for binarization, which is statistically optimal in the sense that it picks the threshold maximizing the between-class variance of the gray levels. The result is a set of regions that are potentially worth looking at. Note that you will certainly need to combine these results with many other kinds of analysis; a good OCR system goes well beyond these direct approaches.
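To make the "statistically optimal" remark concrete, here is a minimal pure-Python sketch of Otsu's threshold selection. It is an illustration only, not the answer's code: the function name `otsu_threshold` and the flat list of integer gray levels are assumptions made for the example. (MATLAB's graythresh, used later in the answer, returns its level normalized to [0, 1] rather than as an integer gray level.)

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance
    when `pixels` (ints in [0, levels)) are split into <= t and > t."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0    # number of pixels in the dark class so far
    sum0 = 0  # sum of gray levels in the dark class so far
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so the split is meaningless
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance (the constant
        # factor 1 / total**2 does not change which t wins).
        score = w0 * w1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t
```

For a clearly bimodal set of gray levels, the returned threshold falls in the gap between the two modes, so every pixel above it belongs to the bright class.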

The method:

1) Convert the image to grayscale (this approach ignores color, although a different method could certainly use it).
2) Use the h-dome transform to remove irrelevant maxima.
3) Calculate the morphological gradient.
4) Binarize with Otsu's method.
5) Remove small objects by area opening.

Removing irrelevant maxima is important for your task, since a bad camera, a bad flash, and an inexperienced photographer can together produce some pretty horrible regions. The h-dome transform is based on morphological reconstruction, so if your library has the latter but not the former, it is straightforward to implement (otherwise you could learn how to implement reconstruction efficiently). The morphological gradient of a discrete image is very simple to apply and, because it is a local operation, tends to work well even under bad illumination. Thresholding its result with Otsu's method keeps the strongest edges (possibly including noise and other minor features). You could precede all of this with Gaussian smoothing as an initial noise-suppression step. The small features are readily removed by area opening. In Matlab, this can be done as follows:

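f = rgb2gray(imread(yourimage));      % yourimage: file name of an RGB image
se = strel('square', 3);              % 3x3 square structuring element
g = imhmax(f, 50);                    % h-dome step, h = 50: suppress maxima of height < 50
g = imdilate(g, se) - imerode(g, se); % morphological gradient
h = im2bw(g, graythresh(g));          % graythresh applies Otsu's method
w = bwareaopen(h, 50);                % area opening: drop components under 50 pixels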

This assumes that objects smaller than 50 pixels are irrelevant (which might not always be the case for small text).
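The answer notes that the maxima-removal step is easy to build from morphological reconstruction when a library lacks it. The following pure-Python sketch shows one way to do that, under stated assumptions: the function names are made up for this example, images are small nested lists of non-negative integers, the neighborhood is 8-connected, and the reconstruction simply iterates until nothing changes, which is simple but slow. The image with its shallow maxima suppressed is the reconstruction by dilation of (image - h) under the image; the h-dome image proper is the difference between the original and that result.

```python
def reconstruct_by_dilation(marker, mask):
    """Grayscale reconstruction of `mask` from `marker` (marker <= mask
    everywhere): repeat a 3x3 dilation clipped by `mask` until stable."""
    rows, cols = len(mask), len(mask[0])
    current = [row[:] for row in marker]
    changed = True
    while changed:
        changed = False
        updated = [row[:] for row in current]
        for r in range(rows):
            for c in range(cols):
                # Maximum over the 3x3 neighbourhood, clipped at the borders.
                best = current[r][c]
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if 0 <= rr < rows and 0 <= cc < cols:
                            best = max(best, current[rr][cc])
                # The result may never grow above the mask image.
                best = min(best, mask[r][c])
                if best != current[r][c]:
                    updated[r][c] = best
                    changed = True
        current = updated
    return current


def h_maxima(image, h):
    """Suppress regional maxima of height below h (other maxima drop by h)."""
    marker = [[max(v - h, 0) for v in row] for row in image]
    return reconstruct_by_dilation(marker, image)


def h_dome(image, h):
    """The domes themselves: the original minus its h-maxima image."""
    flat = h_maxima(image, h)
    return [[a - b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(image, flat)]
```

On a flat background of 10 with one small bump of 40 and one strong peak of 200, h_maxima with h = 50 flattens the bump to 10 and lowers the peak to 150, which is the kind of cleanup the gradient step benefits from.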

Here are the w images for your examples:

These outputs indicate where you should look for text, i.e., the interior of the connected components.
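One possible way to act on that indication is to turn each connected component of w into a bounding box and crop those boxes out before running OCR. The sketch below is an assumption about how one might do this, not part of the answer: `component_boxes` is a hypothetical helper, the mask is a small nested list of 0/1 values, components are 8-connected, and `min_size` plays the same role as the 50-pixel area opening.

```python
from collections import deque


def component_boxes(mask, min_size=1):
    """Return inclusive bounding boxes (top, left, bottom, right) of the
    8-connected components of `mask` that have at least `min_size` pixels."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first search over one connected component.
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            size = 0
            top, left, bottom, right = r0, c0, r0, c0
            while queue:
                r, c = queue.popleft()
                size += 1
                top, bottom = min(top, r), max(bottom, r)
                left, right = min(left, c), max(right, c)
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if (0 <= rr < rows and 0 <= cc < cols
                                and mask[rr][cc] and not seen[rr][cc]):
                            seen[rr][cc] = True
                            queue.append((rr, cc))
            # Small components are skipped, like an area opening would do.
            if size >= min_size:
                boxes.append((top, left, bottom, right))
    return boxes
```

Each returned box covers the interior of a component as well as its outline, so cropping the grayscale image to a box keeps the text that sits inside an edge contour.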
