Character reconstruction and filling for OCR

Views: 25 · Published: 2021/12/8 14:37:22 · matlab image-processing ocr edge-detection


Problem description

I am working with text recognition on tires. In order to use an OCR, I must first get a clear binary map.



I have processed the images, and the text appears with broken, discontinuous edges.
I have tried standard erosion/dilation with circular disc and line structuring elements in MATLAB, but it does not really help.

Pr1- Any ideas on how to reconstruct these characters and fill the gap in between the strokes of characters?





Pr2- The images above are higher resolution and under good illumination. 
However, if the illumination is poor and resolution is comparatively low as in the image below, what would be the viable options for processing?



Solutions tried:

S1: This is the result of applying a median filter to the processed image shared by Spektre. To remove noise I applied a median filter (5x5) and then dilated the image with a line element (5,11). Even now the OCR (Matlab 2014b) can only recognize some of the characters.
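
For readers who want to reproduce S1 outside MATLAB, here is a minimal C++ sketch of the two steps (median filter, then dilation with a line element). It is only an illustration under assumptions: the `Gray` type and both function names are hypothetical, the line element is taken to be horizontal with its length as a parameter, and the post does not say how (5,11) maps onto MATLAB's structuring element.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Grayscale or binary image as rows of integers (hypothetical stand-in type).
using Gray = std::vector<std::vector<int>>;

// Median filter with a (2r+1)x(2r+1) window; r=2 gives a 5x5 window.
// Windows are clipped at the image border.
Gray median_filter(const Gray& img, int r)
{
    assert(!img.empty() && !img[0].empty() && r >= 0);
    const int ys = (int)img.size(), xs = (int)img[0].size();
    Gray out(ys, std::vector<int>(xs, 0));
    std::vector<int> win;
    for (int y = 0; y < ys; y++)
        for (int x = 0; x < xs; x++)
        {
            win.clear();
            for (int yy = std::max(0, y - r); yy <= std::min(ys - 1, y + r); yy++)
                for (int xx = std::max(0, x - r); xx <= std::min(xs - 1, x + r); xx++)
                    win.push_back(img[yy][xx]);
            // Partial sort so that the middle element is the median.
            std::nth_element(win.begin(), win.begin() + win.size() / 2, win.end());
            out[y][x] = win[win.size() / 2];
        }
    return out;
}

// Dilation with a horizontal line of length 2*half+1:
// each output pixel is the maximum over that window.
Gray dilate_hline(const Gray& img, int half)
{
    assert(!img.empty() && !img[0].empty() && half >= 0);
    const int ys = (int)img.size(), xs = (int)img[0].size();
    Gray out(ys, std::vector<int>(xs, 0));
    for (int y = 0; y < ys; y++)
        for (int x = 0; x < xs; x++)
        {
            int m = img[y][x];
            for (int xx = std::max(0, x - half); xx <= std::min(xs - 1, x + half); xx++)
                m = std::max(m, img[y][xx]);
            out[y][x] = m;
        }
    return out;
}
```

The median pass removes isolated dots, and the dilation bridges horizontal gaps up to the line length; gaps in other directions would need a differently oriented element.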

Anyway, thanks a lot for suggestions so far. I will still wait to see if someone can suggest something different perhaps thinking out of the box :).



Results of a MATLAB implementation of the steps from Spektre's code below (without stroke dilation), with normalization using the corners in the order 1,2,3,4:



and with threshold tr0=400 and tr1=180 and corner order for normalization 1,3,2,4


Best Regards

Wajahat
Solution
I have played a bit with your input.
Lighting normalization plus dynamic range normalization helps to get better results, but they are still far from what is needed. I would like to try sharpening via partial derivatives to boost the letters against the background, thresholding out small bumps before integrating back and recoloring into a mask image. When I have the time (not sure when, maybe tomorrow) I will edit this (and comment/notify you).
normalized lighting
compute the average intensity of each corner and bilinearly rescale the intensities to match the average color
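
Written out as formulas, the lighting normalization reads as follows. This is a transcription of the posted ui_normalize() code, not an independent derivation; the symbols x_s, y_s stand for the image size xs, ys, c_00..c_11 for the four corner averages, and p' for the rescaled pixel.

```latex
\begin{aligned}
c_{\mathrm{avg}} &= \tfrac{1}{4}\,(c_{00} + c_{01} + c_{10} + c_{11}) \\
c_0(x) &= c_{00} + (c_{01} - c_{00})\,\frac{x}{x_s} \\
c_1(x) &= c_{10} + (c_{11} - c_{10})\,\frac{x}{x_s} \\
c(x,y) &= c_0(x) + \bigl(c_1(x) - c_0(x)\bigr)\,\frac{y}{y_s} \\
p'(x,y) &= p(x,y)\,\frac{c_{\mathrm{avg}}}{c(x,y)} \qquad \text{for } c(x,y) \neq 0
\end{aligned}
```

Here c(x,y) is the background intensity predicted at each pixel from the corners, so dividing by it flattens a smooth lighting gradient while keeping the overall brightness near c_avg.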

if you need something more sophisticated see:

OpenCV for OCR: How to compute thresholding levels for gray image OCR

edge detection
partial derivatives of the intensity i with respect to x and y:

i' = |∂i(x,y)/∂x| + |∂i(x,y)/∂y|

and then thresholded with threshold=13
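
The edge-detection step above can be sketched in C++ roughly as follows. This is a minimal illustration, not the code used for the posted results: the `Gray` type and `edge_map` name are hypothetical, forward differences stand in for the partial derivatives, and treating values equal to the threshold as edges is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Grayscale image as rows of integer intensities (hypothetical stand-in type).
using Gray = std::vector<std::vector<int>>;

// Returns a binary edge map: 1 where |di/dx| + |di/dy| >= threshold, else 0.
// Forward differences are used, so the last row and column stay 0.
Gray edge_map(const Gray& img, int threshold)
{
    assert(!img.empty() && !img[0].empty());
    const std::size_t ys = img.size(), xs = img[0].size();
    Gray out(ys, std::vector<int>(xs, 0));
    for (std::size_t y = 0; y + 1 < ys; y++)
        for (std::size_t x = 0; x + 1 < xs; x++)
        {
            int dx = std::abs(img[y][x + 1] - img[y][x]); // horizontal difference
            int dy = std::abs(img[y + 1][x] - img[y][x]); // vertical difference
            out[y][x] = (dx + dy >= threshold) ? 1 : 0;
        }
    return out;
}
```

Calling `edge_map(img, 13)` on a smoothed intensity image gives a mask comparable in spirit to the thresholded result described above.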

[notes]
To eliminate most of the noise, I applied a smoothing filter before edge detection.
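
The smoothing filter itself is not shown in this post (the posted filter() calls ui_smooth() five times, but its body is not included). As a stand-in, here is a minimal sketch that assumes a plain 3x3 box blur applied in several passes; the names `Gray`, `box_blur` and `smooth` are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Grayscale image as rows of integer intensities (hypothetical stand-in type).
using Gray = std::vector<std::vector<int>>;

// One pass of a 3x3 box blur; border pixels average only the neighbours that exist.
Gray box_blur(const Gray& img)
{
    assert(!img.empty() && !img[0].empty());
    const int ys = (int)img.size(), xs = (int)img[0].size();
    Gray out(ys, std::vector<int>(xs, 0));
    for (int y = 0; y < ys; y++)
        for (int x = 0; x < xs; x++)
        {
            int sum = 0, n = 0;
            for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++)
                {
                    int yy = y + dy, xx = x + dx;
                    if (yy < 0 || yy >= ys || xx < 0 || xx >= xs) continue;
                    sum += img[yy][xx];
                    n++;
                }
            out[y][x] = sum / n; // integer average of the valid neighbourhood
        }
    return out;
}

// Repeated passes widen the effective blur radius.
Gray smooth(Gray img, int passes)
{
    for (int i = 0; i < passes; i++) img = box_blur(img);
    return img;
}
```

More passes suppress more noise but also blur thin strokes, so the pass count is a trade-off to tune per image.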
[edit1] After some analysis I found that your image has poor edges for sharpening integration.
Here is an example of the intensity graph after the first derivative by x along the middle line of the image:

As you can see, the black areas are fine but the whitish ones are almost indistinguishable from background noise. So your only hope is to use min/max filtering, as @Daniel's answer suggested, and put more weight on the black edge regions (the white ones are not reliable).

The min/max filter emphasizes the black (blue mask) and white (red mask) regions. If both areas were reliable, you could just fill the space between them, but that is not an option in your case. Instead I would enlarge the areas (weighted more towards the blue mask) and OCR the result with an OCR customized for such 3-color input.

You can make your own custom OCR for this; see OCR and character similarity

You could also take 2 images with different light positions and a fixed camera, and combine them to cover the recognizable black area from all sides.
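
How to combine the two shots is not specified here. One simple possibility, offered purely as an assumption, is a per-pixel minimum of the two aligned intensity images, which keeps the dark (shadow) regions from both light positions. The names `Gray` and `combine_darkest` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Grayscale image as rows of integer intensities (hypothetical stand-in type).
using Gray = std::vector<std::vector<int>>;

// Per-pixel minimum of two aligned grayscale images of identical size.
// A pixel that is dark in either input stays dark in the output.
Gray combine_darkest(const Gray& a, const Gray& b)
{
    assert(a.size() == b.size());
    Gray out(a.size());
    for (std::size_t y = 0; y < a.size(); y++)
    {
        assert(a[y].size() == b[y].size());
        out[y].resize(a[y].size());
        for (std::size_t x = 0; x < a[y].size(); x++)
            out[y][x] = std::min(a[y][x], b[y][x]);
    }
    return out;
}
```

This only makes sense if the camera really is fixed, so that the two images are pixel-aligned, and if each image has had its lighting normalized first.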
[edit2] C++ source code for the last method
//---------------------------------------------------------------------------
typedef union { int dd; short int dw[2]; byte db[4]; } color;
picture pic0,pic1,pic2; // pic0 source image,pic1 normalized+min/max,pic2 enlarge filter
//---------------------------------------------------------------------------
void filter()
    {
    int sz=16;          // [pixels] square size for corner avg color computation (c00..c11)
    int fs0=5;          // blue [pixels] font thickness
    int fs1=2;          // red  [pixels] font thickness
    int tr0=320;        // blue min threshold
    int tr1=125;        // red  max threshold

    int x,y,c,cavg,cmin,cmax;
    pic1=pic0;          // copy source image
    pic1.rgb2i();       // convert to grayscale intensity

    for (x=0;x<5;x++) pic1.ui_smooth();
    cavg=pic1.ui_normalize();

    // min max filter
    cmin=pic1.p[0][0].dd; cmax=cmin;
    for (y=0;y<pic1.ys;y++)
     for (x=0;x<pic1.xs;x++)
        {
        c=pic1.p[y][x].dd;
        if (cmin>c) cmin=c;
        if (cmax<c) cmax=c;
        }
    // threshold min/max
    for (y=0;y<pic1.ys;y++)
     for (x=0;x<pic1.xs;x++)
        {
        c=pic1.p[y][x].dd;
             if (cmax-c<tr1) c=0x00FF0000; // red
        else if (c-cmin<tr0) c=0x000000FF; // blue
        else                 c=0x00000000; // black
        pic1.p[y][x].dd=c;
        }
    pic1.rgb_smooth();  // remove single dots

    // recolor image
    pic2=pic1; pic2.clear(0);
    pic2.bmp->Canvas->Pen  ->Color=clWhite;
    pic2.bmp->Canvas->Brush->Color=clWhite;
    for (y=0;y<pic1.ys;y++)
     for (x=0;x<pic1.xs;x++)
        {
        c=pic1.p[y][x].dd;
        if (c==0x00FF0000)
            {
            pic2.bmp->Canvas->Pen  ->Color=clRed;
            pic2.bmp->Canvas->Brush->Color=clRed;
            pic2.bmp->Canvas->Ellipse(x-fs1,y-fs1,x+fs1,y+fs1); // red
            }
        if (c==0x000000FF)
            {
            pic2.bmp->Canvas->Pen  ->Color=clBlue;
            pic2.bmp->Canvas->Brush->Color=clBlue;
            pic2.bmp->Canvas->Ellipse(x-fs0,y-fs0,x+fs0,y+fs0); // blue
            }
        }
    }
//---------------------------------------------------------------------------
int  picture::ui_normalize(int sz=32)
    {
    if (xs<sz) return 0;
    if (ys<sz) return 0;
    int x,y,c,c0,c1,c00,c01,c10,c11,cavg;

    // compute average intensity in corners
    for (c00=0,y=         0;y<     sz;y++) for (x=         0;x<     sz;x++) c00+=p[y][x].dd; c00/=sz*sz;
    for (c01=0,y=         0;y<     sz;y++) for (x=xs-sz;x<xs;x++) c01+=p[y][x].dd; c01/=sz*sz;
    for (c10=0,y=ys-sz;y<ys;y++) for (x=         0;x<     sz;x++) c10+=p[y][x].dd; c10/=sz*sz;
    for (c11=0,y=ys-sz;y<ys;y++) for (x=xs-sz;x<xs;x++) c11+=p[y][x].dd; c11/=sz*sz;
    cavg=(c00+c01+c10+c11)/4;

    // normalize lighting conditions
    for (y=0;y<ys;y++)
     for (x=0;x<xs;x++)
        {
        // avg color = bilinear interpolation of corners colors
        c0=c00+(((c01-c00)*x)/xs);
        c1=c10+(((c11-c10)*x)/xs);
        c =c0 +(((c1 -c0 )*y)/ys);
        // scale to avg color
        if (c) p[y][x].dd=(p[y][x].dd*cavg)/c;
        }
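    // compute min max intensities
    for (c0=p[0][0].dd,c1=c0,y=0;y<ys;y++)
     for (x=0;x<xs;x++)
        {
        c=p[y][x].dd;
        if (c0>c) c0=c;
        if (c1<c) c1=c;
        }
    // maximize dynamic range <0,765>
    if (c1>c0)          // skip flat images to avoid division by zero
     for (y=0;y<ys;y++)
      for (x=0;x<xs;x++)
       p[y][x].dd=((p[y][x].dd-c0)*765)/(c1-c0);   // write the stretched value back to the pixel
    return cavg;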
    }
//---------------------------------------------------------------------------
void picture::rgb_smooth()
    {
    color   *q0,*q1;
    int     x,y,i;
    color   c0,c1,c2;
    if ((xs<2)||(ys<2)) return;
    for (y=0;y<ys-1;y++)
        {
        q0=p[y  ];
        q1=p[y+1];
        for (x=0;x<xs-1;x++)
            {
            c0=q0[x];
            c1=q0[x+1];
            c2=q1[x];
            for (i=0;i<4;i++) q0[x].db[i]=WORD((WORD(c0.db[i])+WORD(c0.db[i])+WORD(c1.db[i])+WORD(c2.db[i]))>>2);
            }
        }
    }
//---------------------------------------------------------------------------
I use my own picture class for images so some members are:

xs,ys size of image in pixels
p[y][x].dd is pixel at (x,y) position as 32 bit integer type
clear(color) - clears entire image
resize(xs,ys) - resizes image to new resolution
bmp - VCL encapsulated GDI Bitmap with Canvas access

I added source just for 2 relevant member functions (no need to copy whole class here)
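
rgb2i() is one of the members not shown. Judging only from the <0,765> range comment in ui_normalize() and the 0x00RRGGBB constants used for red and blue, a plausible guess is that the intensity is simply the sum of the three channels. The sketch below encodes that assumption; `rgb_to_intensity` is a hypothetical name, not part of the picture class.

```cpp
#include <cassert>
#include <cstdint>

// Assumed conversion: 0x00RRGGBB pixel -> intensity r+g+b in the range 0..765.
int rgb_to_intensity(std::uint32_t rgb)
{
    int r = (rgb >> 16) & 0xFF;
    int g = (rgb >> 8) & 0xFF;
    int b = rgb & 0xFF;
    int i = r + g + b;
    assert(i >= 0 && i <= 765); // matches the <0,765> range mentioned in ui_normalize
    return i;
}
```

If the real rgb2i() uses weighted channels instead, the thresholds tr0 and tr1 would need to be rescaled accordingly.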
[edit3] LQ image
The best setting I found (code is the same):
int sz=32;          // [pixels] square size for corner avg color computation (c00..c11)
int fs0=2;          // blue [pixels] font thickness
int fs1=2;          // red  [pixels] font thickness
int tr0=52;         // blue min threshold
int tr1=0;          // red  max threshold

Due to lighting conditions the red area is unusable (turned off)
