C# OCR (How to Read a Single Character from an Image)


Problem description

Hello,
I want to build an OCR project in C# that recognizes a single character from an image (not a sentence from a document). Can anyone help me out with some code? I searched the internet and found two or three OCR code samples, but they were too difficult to understand. Can anyone give me some simple code or point me to a free library for this?

Thanks in advance

Recommended answer

As you said, you need OCR (Optical Character Recognition). There is no built-in OCR function in C#, but the Microsoft Office Document Imaging (MODI) library may help. Check out these links:

How To: Use Office 2007 OCR Using C#

http://www.devsource.com/c/a/Languages/Using-The-Office-2007-OCR-Component-in-C/

OCR with Microsoft® Office

Free libraries that you can use:
- Tesseract
http://code.google.com/p/tesseract-ocr/

- GOCR


Tesseract and GOCR are not easy to use and their results are not very good; Office is proprietary and not always available.

There is a C# binding for Tesseract called Tessnet: http://www.pixel-technology.com/freeware/tessnet2/

There are some good projects on CodeProject. I think they are better, but they need to be completed; the basic and most difficult work is done. Since you mentioned "single character", one of them should be ideal for you (I tried them out). See: Neural Network OCR (this one is one of the best), Creating Optical Character Recognition (OCR) applications using Neural Networks, OCR Line Detection, Unicode Optical Character Recognition (this one is one of the best).
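To give a feel for what single-character recognition involves, here is a minimal sketch. It is not taken from any of the projects named above, and it swaps the neural network for plain nearest-template matching, which is simpler to follow. The class `TemplateOcr`, the 5x5 templates, and the three-letter alphabet are all hypothetical; the sketch also assumes the glyph has already been cropped, binarized, and scaled to the template size.

```csharp
using System;
using System.Collections.Generic;

public static class TemplateOcr
{
    // Hypothetical 5x5 templates; '#' marks an ink pixel, '.' marks background.
    private static readonly Dictionary<char, string[]> Templates =
        new Dictionary<char, string[]>
        {
            { 'I', new[] { "#####", "..#..", "..#..", "..#..", "#####" } },
            { 'L', new[] { "#....", "#....", "#....", "#....", "#####" } },
            { 'T', new[] { "#####", "..#..", "..#..", "..#..", "..#.." } },
        };

    // Counts the cells in which two equally sized grids differ (Hamming distance).
    private static int Distance(string[] a, string[] b)
    {
        int diff = 0;
        for (int y = 0; y < a.Length; y++)
            for (int x = 0; x < a[y].Length; x++)
                if (a[y][x] != b[y][x]) diff++;
        return diff;
    }

    // Returns the template character whose grid differs from the glyph
    // in the fewest cells. Assumes the glyph is 5x5 like the templates.
    public static char Recognize(string[] glyph)
    {
        char best = '?';
        int bestDistance = int.MaxValue;
        foreach (KeyValuePair<char, string[]> entry in Templates)
        {
            int d = Distance(glyph, entry.Value);
            if (d < bestDistance)
            {
                bestDistance = d;
                best = entry.Key;
            }
        }
        return best;
    }
}
```

Calling `TemplateOcr.Recognize` with a slightly noisy "T" (one extra ink pixel in the bottom row) still picks 'T', because it differs from that template in one cell and from the others in more. A neural network plays the same role as `Distance` here, but learns the comparison from training samples instead of using fixed templates.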

Good luck,
—SA


I am using MODI (Office OCR) to do this. It works like a charm for words, sentences, numbers, and mixes of these. However, it fails if the image contains only a single character.
I have no idea why it fails, but since this is what you want to do, you should probably go for one of the other OCR solutions.

Does anybody know why it fails with only one letter?

Here is my code:

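using System.Drawing;                  // Bitmap, Rectangle
using System.Drawing.Imaging;          // BitmapData, ImageLockMode, PixelFormat
using System.Runtime.InteropServices;  // Marshal

// Load image from file
Bitmap BWImage = new Bitmap(fileName);
// Lock the bitmap in memory as 1 bpp (ReadOnly: the pixel data is only read)
BitmapData BWLockImage = BWImage.LockBits(new Rectangle(0, 0, BWImage.Width, BWImage.Height), ImageLockMode.ReadOnly, PixelFormat.Format1bppIndexed);

// Copy image data to a byte array
int imageSize = BWLockImage.Stride * BWLockImage.Height;
byte[] BWImageBuffer = new byte[imageSize];
Marshal.Copy(BWLockImage.Scan0, BWImageBuffer, 0, imageSize);
// tmpPosRect: the region to recognize, defined elsewhere
string result = DoOCR(BWLockImage, BWImageBuffer, tmpPosRect, false);
// Release the lock once OCR is done
BWImage.UnlockBits(BWLockImage);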



// Do the OCR with this function.
// Needs System, System.IO, System.Drawing, System.Drawing.Imaging and a reference to the MODI COM library.
// _MODIDocument is a MODI.Document field; Bildausschnitt1bpp crops the given region out of the 1 bpp buffer.
public string DoOCR(BitmapData BWLockImage, byte[] BWImageBuffer, Rectangle iAusschnitt, bool isNumber)
{
    string file = Path.GetTempFileName();
    string tmpResult = "";
    Bitmap tmpImage = null;
    try
    {
        tmpImage = Bildausschnitt1bpp(BWLockImage, BWImageBuffer, iAusschnitt);
        tmpImage.Save(file, ImageFormat.Tiff);
        _MODIDocument.Create(file);
        // Run OCR with the MODI parameters
        _MODIDocument.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, false, false);

        MODI.IImage myImage = (MODI.IImage)_MODIDocument.Images[0]; // first page in file
        MODI.ILayout myLayout = (MODI.ILayout)myImage.Layout;
        tmpResult = myLayout.Text;
    }
    catch
    {
        // On any failure, return an empty string
        tmpResult = "";
    }
    finally
    {
        // Clean up on success as well as on failure
        if (_MODIDocument != null)
        {
            _MODIDocument.Close(false); // closes the document and deallocates the memory
        }
        // Release the image
        if (tmpImage != null)
        {
            tmpImage.Dispose();
        }
        // Run the garbage collector so the file is no longer held open
        GC.Collect();
        GC.WaitForPendingFinalizers();
        // Delete the temporary image file
        File.Delete(file);
    }
    return tmpResult;
}
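
The byte array copied from the locked bitmap holds eight pixels per byte, which is what a crop helper such as Bildausschnitt1bpp (not shown in the post) has to deal with. As a rough sketch, this is how a single pixel can be read back out of such a buffer. The class `OneBppBuffer` and its methods are hypothetical names, and the code assumes a positive stride (top-down bitmap). Whether a set bit means black or white depends on the bitmap's palette.

```csharp
using System;

public static class OneBppBuffer
{
    // Returns true when the bit for pixel (x, y) is set.
    // stride is the number of bytes per scan line (BitmapData.Stride),
    // assumed positive here.
    public static bool GetBit(byte[] buffer, int stride, int x, int y)
    {
        int index = y * stride + (x >> 3); // byte that holds the pixel
        int mask = 0x80 >> (x & 7);        // leftmost pixel is the highest bit
        return (buffer[index] & mask) != 0;
    }

    // Counts the set bits inside a rectangle, for example to check
    // whether a cropped region contains any ink at all.
    public static int CountSetBits(byte[] buffer, int stride,
                                   int left, int top, int width, int height)
    {
        int count = 0;
        for (int y = top; y < top + height; y++)
            for (int x = left; x < left + width; x++)
                if (GetBit(buffer, stride, x, y)) count++;
        return count;
    }
}
```

With the variables from the post, a call would look like `OneBppBuffer.GetBit(BWImageBuffer, BWLockImage.Stride, x, y)`. Note that the stride is usually larger than width / 8, because each scan line is padded to a multiple of four bytes.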

