OCR fails due to font specifics

Views: 110 | Posted: 2016/12/21 23:55:20 | Tags: c# image winforms comparison ocr

Problem description

I have a library which contains all font characters (Arial in my case). For example:

I'm using this library to OCR text from images.

The problem is that characters such as "j", "/" and "t" can overlap their neighbours when rendered. OCR then becomes impossible, because the characters cut from the image no longer match the pattern images exactly (up to 3 pixels are different).

How should I deal with this problem? Is there a better way to compare images? (C#, WinForms app)

I'm using this method for comparison:

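// Requires: using System; using System.Drawing; using System.Drawing.Imaging;
//           using System.Runtime.InteropServices;

// The original snippet omitted the memcmp declaration; this assumes the usual msvcrt P/Invoke.
[DllImport("msvcrt.dll", CallingConvention = CallingConvention.Cdecl)]
private static extern int memcmp(IntPtr b1, IntPtr b2, UIntPtr count);

public static bool CompareMemCmp(Bitmap b1, Bitmap b2)
{
    // Two nulls compare equal; one null never matches a bitmap.
    if (b1 == null || b2 == null) return b1 == null && b2 == null;
    if (b1.Size != b2.Size) return false;

    var rect = new Rectangle(new System.Drawing.Point(0, 0), b1.Size);
    var bd1 = b1.LockBits(rect, ImageLockMode.ReadOnly, PixelFormat.Format32bppArgb);
    var bd2 = b2.LockBits(rect, ImageLockMode.ReadOnly, PixelFormat.Format32bppArgb);

    try
    {
        // Both bitmaps are locked in the same format and size, so the strides agree.
        int len = bd1.Stride * b1.Height;

        // Byte-for-byte comparison: true only when every pixel is identical.
        return memcmp(bd1.Scan0, bd2.Scan0, new UIntPtr((uint)len)) == 0;
    }
    finally
    {
        b1.UnlockBits(bd1);
        b2.UnlockBits(bd2);
    }
}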

It's extremely fast and reliable, but unfortunately it cannot produce a match when the condition above is met, because it only reports exact equality.
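To make the "up to 3 pixels are different" case concrete, here is a minimal sketch that counts differing pixels instead of testing only for equality. It assumes the pixels of both bitmaps have already been copied into int arrays (for example with Marshal.Copy from Scan0 after LockBits); the class name GlyphCompare, both method names, and the maxDiff parameter are hypothetical and not part of the original code.

```csharp
using System;

public static class GlyphCompare
{
    // Counts positions where two equally sized 32bpp ARGB pixel buffers differ.
    // Returns -1 when the buffers cannot be compared (null or different lengths).
    public static int CountDifferingPixels(int[] a, int[] b)
    {
        if (a == null || b == null || a.Length != b.Length) return -1;

        int diff = 0;
        for (int i = 0; i < a.Length; i++)
        {
            if (a[i] != b[i]) diff++;
        }
        return diff;
    }

    // Hypothetical tolerance check: accepts a match when at most maxDiff pixels differ.
    public static bool MatchesWithTolerance(int[] a, int[] b, int maxDiff)
    {
        int diff = CountDifferingPixels(a, b);
        return diff >= 0 && diff <= maxDiff;
    }
}
```

A tolerance of this kind is slower than memcmp, and it may confuse glyphs that differ from each other by only a few pixels, so the threshold would have to be tuned against the actual template library.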

Solution

You could treat these overlapping character pairs as "characters" in their own right (though there could be an unreasonable number of them). For example, the "-j" combination would be recognized as a single "-j" character.
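One way this suggestion might look in code is sketched below: the template library stores an entry for the overlapping pair alongside the single characters, and lookup returns the text of whichever template matches exactly. The GlyphTemplate and TemplateLibrary types and the int-array pixel representation are assumptions made for illustration; they do not come from the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class GlyphTemplate
{
    public string Text;   // what the template stands for, e.g. "j" or the pair "-j"
    public int Width;
    public int Height;
    public int[] Pixels;  // Width * Height ARGB values, row by row
}

public static class TemplateLibrary
{
    // Returns the text of the first template whose size and pixels match exactly,
    // or null when nothing in the library matches.
    public static string Recognize(IEnumerable<GlyphTemplate> templates,
                                   int[] pixels, int width, int height)
    {
        foreach (var t in templates)
        {
            if (t.Width == width && t.Height == height && t.Pixels.SequenceEqual(pixels))
                return t.Text;
        }
        return null;
    }
}
```

With this layout, a region cut from the image that contains the overlapping "-j" pair is looked up exactly like a single character, and the caller appends the returned two-character string to the output.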
