Character reconstruction and filling for OCR
I am working with text recognition on tires. In order to use an OCR, I must first get a clear binary map.
I have processed the images and the text appears with broken and discontinuous edges. I have tried standard erosion/dilation with circular disc and line structuring elements in MATLAB, but it does not really help.
Pr1- Any ideas on how to reconstruct these characters and fill the gaps between the strokes of the characters?
Pr2- The images above are higher resolution and under good illumination. However, if the illumination is poor and resolution is comparatively low as in the image below, what would be the viable options for processing?
Solutions tried:
S1: This is the result of applying a median filter to the processed image shared by Spektre. To remove noise I applied a median filter (5x5) and subsequently an image dilation with a line element (5,11). Even now the OCR (Matlab 2014b) can only recognize some of the characters.
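The median step from S1 can be sketched in standalone C++ (my own minimal 3x3 version on a bare row-major grayscale buffer; S1 actually used a 5x5 window in MATLAB, so the window size and image type here are simplifications):

```cpp
#include <algorithm>
#include <vector>

// 3x3 median filter on a row-major grayscale image (border pixels copied as-is).
std::vector<int> median3x3(const std::vector<int>& img, int xs, int ys)
{
    std::vector<int> out(img);
    for (int y = 1; y < ys - 1; y++)
        for (int x = 1; x < xs - 1; x++)
        {
            int w[9], k = 0;
            for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++)
                    w[k++] = img[(y + dy) * xs + (x + dx)];
            std::nth_element(w, w + 4, w + 9);  // partial sort: w[4] is the median of 9
            out[y * xs + x] = w[4];
        }
    return out;
}
```

Isolated noise pixels (like the single-dot speckle the filter targets here) get replaced by the neighborhood median, while edges longer than the window survive.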
Anyway, thanks a lot for the suggestions so far. I will still wait to see if someone can suggest something different, perhaps thinking out of the box :).
Results of the Matlab implementation of the steps from Spektre's code below (without stroke dilation; normalization with corners in order 1,2,3,4):
and with thresholds tr0=400 and tr1=180 and corner order 1,3,2,4 for normalization:
Best Regards
Wajahat
Solution

I have played a bit with your input.
Normalization of lighting + dynamic range normalization helps a bit to obtain much better results, but still far from what is needed. I would like to try sharpening the partial derivatives to boost the letters from the background, threshold out small bumps, and then integrate back and recolor to a mask image. When I have the time (not sure when, maybe tomorrow) I will edit this (and comment/notify you).
normalized lighting
compute the average corner intensities and bilinearly rescale the image intensities to match the average color
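This normalization step can be sketched as standalone C++ (a simplified version of the `ui_normalize` idea in the code further below; here the four corners are single pixels rather than sz×sz averages, which is my simplification):

```cpp
#include <vector>

// Normalize lighting: estimate the background from the 4 corner intensities,
// bilinearly interpolate it across the image, and rescale each pixel so the
// local background matches the average corner intensity.
void normalize_lighting(std::vector<int>& img, int xs, int ys)
{
    int c00 = img[0],              c01 = img[xs - 1];
    int c10 = img[(ys - 1) * xs],  c11 = img[(ys - 1) * xs + xs - 1];
    int cavg = (c00 + c01 + c10 + c11) / 4;
    for (int y = 0; y < ys; y++)
        for (int x = 0; x < xs; x++)
        {
            // bilinear interpolation of the corner colors at (x,y)
            int c0 = c00 + ((c01 - c00) * x) / (xs - 1);
            int c1 = c10 + ((c11 - c10) * x) / (xs - 1);
            int c  = c0  + ((c1  - c0 ) * y) / (ys - 1);
            // rescale pixel so the interpolated background becomes cavg
            if (c) img[y * xs + x] = (img[y * xs + x] * cavg) / c;
        }
}
```

A pure illumination gradient collapses to a flat image, while features darker or brighter than the local background keep their contrast.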
edge detection
partial derivative of intensity i by x and y:

    i = |di(x,y)/dx| + |di(x,y)/dy|

and then thresholded by threshold = 13
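In code form, this derivative-sum edge detector might look like the following (a minimal forward-difference sketch of my own; 13 is the threshold quoted above):

```cpp
#include <cstdlib>
#include <vector>

// Binary edge map: e = |di/dx| + |di/dy| per pixel (forward differences),
// then thresholded. Last row/column stay 0 for simplicity.
std::vector<int> edge_map(const std::vector<int>& img, int xs, int ys, int threshold)
{
    std::vector<int> e(img.size(), 0);
    for (int y = 0; y < ys - 1; y++)
        for (int x = 0; x < xs - 1; x++)
        {
            int dx = std::abs(img[y * xs + x + 1]   - img[y * xs + x]);
            int dy = std::abs(img[(y + 1) * xs + x] - img[y * xs + x]);
            e[y * xs + x] = (dx + dy >= threshold) ? 1 : 0;
        }
    return e;
}
```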
[notes]
To eliminate most noise I applied smooth filtering before edge detection
[edit1] After some analysis I found that your image has edges too poor for the sharpening/integration approach.
Here is an example of the intensity graph after the first derivative by x, along the middle line of the image:
As you can see, the black areas are fine but the white-ish ones are almost indistinguishable from background noise. So your only hope is to use the min/max filtering suggested in @Daniel's answer and put more weight on the black edge regions (the white ones are not reliable).
The min/max filter emphasizes the black (blue mask) and white (red mask) regions. If both areas were reliable you would just fill the space between them, but that is not an option in your case. Instead I would enlarge the areas (weighted more toward the blue mask) and OCR the result with an OCR customized for such 3-color input.
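A minimal sketch of this min/max classification (same red-before-blue test ordering and `tr0`/`tr1` meaning as in the C++ `filter()` further below; the enlargement step would then be a dilation of the resulting masks):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

enum { BLACK = 0, BLUE = 1, RED = 2 };

// Classify each pixel against the global min/max intensity:
// near-max -> RED (bright highlights), near-min -> BLUE (dark strokes),
// everything else -> BLACK (background).
std::vector<int> classify(const std::vector<int>& img, int tr0, int tr1)
{
    int cmin = *std::min_element(img.begin(), img.end());
    int cmax = *std::max_element(img.begin(), img.end());
    std::vector<int> out(img.size(), BLACK);
    for (std::size_t i = 0; i < img.size(); i++)
    {
        if      (cmax - img[i] < tr1) out[i] = RED;
        else if (img[i] - cmin < tr0) out[i] = BLUE;
    }
    return out;
}
```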
- You can make your own custom OCR for this, see OCR and character similarity
You could also take 2 images with a different light position and a fixed camera, and combine them to cover the recognizable black area from all sides.
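One simple way to combine two such registered exposures (an assumption on my part, not spelled out in the answer: since the dark strokes are the reliable side, keeping the per-pixel minimum preserves strokes shadowed in either lighting):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Combine two registered grayscale exposures of the same scene:
// keep the darker of the two readings at every pixel, so the
// recognizable black stroke areas from both lightings survive.
std::vector<int> combine_min(const std::vector<int>& a, const std::vector<int>& b)
{
    std::vector<int> out(a.size());
    for (std::size_t i = 0; i < a.size(); i++)
        out[i] = std::min(a[i], b[i]);
    return out;
}
```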
[edit2] C++ source code for the last method
//---------------------------------------------------------------------------
typedef union { int dd; short int dw[2]; byte db[4]; } color;
picture pic0,pic1,pic2;  // pic0 source image, pic1 normalized+min/max, pic2 enlarge filter
//---------------------------------------------------------------------------
void filter()
    {
    int sz=16;      // [pixels] square size for corner avg color computation (c00..c11)
    int fs0=5;      // blue [pixels] font thickness
    int fs1=2;      // red [pixels] font thickness
    int tr0=320;    // blue min treshold
    int tr1=125;    // red max treshold
    int x,y,c,cavg,cmin,cmax;
    pic1=pic0;      // copy source image
    pic1.rgb2i();   // convert to grayscale intensity
    for (x=0;x<5;x++) pic1.ui_smooth();
    cavg=pic1.ui_normalize();
    // min max filter
    cmin=pic1.p[0][0].dd; cmax=cmin;
    for (y=0;y<pic1.ys;y++)
     for (x=0;x<pic1.xs;x++)
        {
        c=pic1.p[y][x].dd;
        if (cmin>c) cmin=c;
        if (cmax<c) cmax=c;
        }
    // treshold min/max
    for (y=0;y<pic1.ys;y++)
     for (x=0;x<pic1.xs;x++)
        {
        c=pic1.p[y][x].dd;
             if (cmax-c<tr1) c=0x00FF0000;  // red
        else if (c-cmin<tr0) c=0x000000FF;  // blue
        else                 c=0x00000000;  // black
        pic1.p[y][x].dd=c;
        }
    pic1.rgb_smooth();  // remove single dots
    // recolor image
    pic2=pic1; pic2.clear(0);
    pic2.bmp->Canvas->Pen ->Color=clWhite;
    pic2.bmp->Canvas->Brush->Color=clWhite;
    for (y=0;y<pic1.ys;y++)
     for (x=0;x<pic1.xs;x++)
        {
        c=pic1.p[y][x].dd;
        if (c==0x00FF0000)
            {
            pic2.bmp->Canvas->Pen ->Color=clRed;
            pic2.bmp->Canvas->Brush->Color=clRed;
            pic2.bmp->Canvas->Ellipse(x-fs1,y-fs1,x+fs1,y+fs1);  // red
            }
        if (c==0x000000FF)
            {
            pic2.bmp->Canvas->Pen ->Color=clBlue;
            pic2.bmp->Canvas->Brush->Color=clBlue;
            pic2.bmp->Canvas->Ellipse(x-fs0,y-fs0,x+fs0,y+fs0);  // blue
            }
        }
    }
//---------------------------------------------------------------------------
int picture::ui_normalize(int sz=32)
    {
    if (xs<sz) return 0;
    if (ys<sz) return 0;
    int x,y,c,c0,c1,c00,c01,c10,c11,cavg;
    // compute average intensity in corners
    for (c00=0,y=    0;y<sz;y++) for (x=    0;x<sz;x++) c00+=p[y][x].dd; c00/=sz*sz;
    for (c01=0,y=    0;y<sz;y++) for (x=xs-sz;x<xs;x++) c01+=p[y][x].dd; c01/=sz*sz;
    for (c10=0,y=ys-sz;y<ys;y++) for (x=    0;x<sz;x++) c10+=p[y][x].dd; c10/=sz*sz;
    for (c11=0,y=ys-sz;y<ys;y++) for (x=xs-sz;x<xs;x++) c11+=p[y][x].dd; c11/=sz*sz;
    cavg=(c00+c01+c10+c11)/4;
    // normalize lighting conditions
    for (y=0;y<ys;y++)
     for (x=0;x<xs;x++)
        {
        // avg color = bilinear interpolation of corners colors
        c0=c00+(((c01-c00)*x)/xs);
        c1=c10+(((c11-c10)*x)/xs);
        c =c0 +(((c1 -c0 )*y)/ys);
        // scale to avg color
        if (c) p[y][x].dd=(p[y][x].dd*cavg)/c;
        }
    // compute min max intensities
    for (c0=0,c1=0,y=0;y<ys;y++)
     for (x=0;x<xs;x++)
        {
        c=p[y][x].dd;
        if (c0>c) c0=c;
        if (c1<c) c1=c;
        }
    // maximize dynamic range <0,765>
    for (y=0;y<ys;y++)
     for (x=0;x<xs;x++)
      p[y][x].dd=((p[y][x].dd-c0)*765)/(c1-c0);
    return cavg;
    }
//---------------------------------------------------------------------------
void picture::rgb_smooth()
    {
    color *q0,*q1;
    int x,y,i;
    color c0,c1,c2;
    if ((xs<2)||(ys<2)) return;
    for (y=0;y<ys-1;y++)
        {
        q0=p[y  ];
        q1=p[y+1];
        for (x=0;x<xs-1;x++)
            {
            c0=q0[x]; c1=q0[x+1]; c2=q1[x];
            // weighted average: 2*current + right + below, per byte channel
            for (i=0;i<4;i++) q0[x].db[i]=WORD((WORD(c0.db[i])+WORD(c0.db[i])+WORD(c1.db[i])+WORD(c2.db[i]))>>2);
            }
        }
    }
//---------------------------------------------------------------------------
I use my own picture class for images so some members are:
xs,ys - size of image in pixels
p[y][x].dd - pixel at (x,y) position as 32 bit integer type
clear(color) - clears entire image
resize(xs,ys) - resizes image to new resolution
bmp - VCL encapsulated GDI Bitmap with Canvas access
I added source just for 2 relevant member functions (no need to copy whole class here)
[edit3] LQ image
The best setting I found (code is the same):
int sz=32;   // [pixels] square size for corner avg color computation (c00..c11)
int fs0=2;   // blue [pixels] font thickness
int fs1=2;   // red [pixels] font thickness
int tr0=52;  // blue min treshold
int tr1=0;   // red max treshold
Due to lighting conditions the red area is unusable (turned off)