Tesseract OCR simple example


Problem description

Hi. Can anyone give me a simple example of testing Tesseract OCR, preferably in C#?
I tried the demo found here. I downloaded the English dataset, unzipped it to the C drive, and modified the code as follows:

string path = @"C:\pic\mytext.jpg";
Bitmap image = new Bitmap(path);
Tesseract ocr = new Tesseract();
ocr.SetVariable("tessedit_char_whitelist", "0123456789"); // If digit only
ocr.Init(@"C:\tessdata\", "eng", false); // To use correct tessdata
List<tessnet2.Word> result = ocr.DoOCR(image, Rectangle.Empty);
foreach (tessnet2.Word word in result)
    Console.WriteLine("{0} : {1}", word.Confidence, word.Text);

Unfortunately, the code doesn't work. The program dies at the "ocr.Init(..." line. I can't catch an exception, even with try-catch.

I was able to run vietocr, but that is a very large project for me to follow. I need a simple example like the one above.

Thanks.

Recommended answer

OK. I found the solution here: "tessnet2 fails to load", in the answer given by Adam.

Apparently I was using the wrong version of tessdata. I was following the source page's instructions intuitively, and that caused the problem.

It says:

Quick Tessnet2 usage

  1. Download binary here, add a reference of the assembly Tessnet2.dll to your .NET project.

  2. Download language data definition file here and put it in tessdata directory. Tessdata directory and your exe must be in the same directory.
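To make the "same directory" requirement concrete, here is a minimal sketch that builds the path of a tessdata folder next to the running exe and reports whether it exists. TessdataLocator and NextToExe are hypothetical names made up for illustration; they are not part of tessnet2, and the sketch only uses standard .NET file APIs.

```csharp
using System;
using System.IO;

public static class TessdataLocator
{
    // Hypothetical helper: path of a "tessdata" folder sitting next to the running exe.
    public static string NextToExe()
    {
        return Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "tessdata");
    }

    public static void Main()
    {
        string dir = NextToExe();
        Console.WriteLine("Expecting language data in: " + dir);
        Console.WriteLine(Directory.Exists(dir) ? "Folder found." : "Folder missing.");
    }
}
```

Running a check like this before creating the Tesseract object at least rules out a misplaced folder as the cause.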

After you download the binary and follow the link to download the language file, you see many language files, but none of them is the right version. You need to select all versions and go to the next page to find the correct one (tesseract-2.00.eng)! They should either update the binary download link to version 3 or put the version 2 language file on the first page. Or at least mention in bold that this version issue is a big deal!
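Since the failure gives no exception to inspect, a rough pre-flight check of the data folder can help. The sketch below guesses which Tesseract generation a folder was built for from file names alone. It rests on two assumptions that are not stated in this thread: that version 3 data ships as a single "<lang>.traineddata" file, and that the tesseract-2.00 archive unpacks into several files per language, including "<lang>.inttemp" and "<lang>.unicharset". TessdataCheck and TessdataLayout are hypothetical names, not part of tessnet2.

```csharp
using System;
using System.IO;

// Hypothetical result type for the check below.
public enum TessdataLayout { Missing, Version2, Version3, Unknown }

public static class TessdataCheck
{
    // Guesses the data format from file names only; it never opens the files.
    public static TessdataLayout Detect(string tessdataDir, string lang)
    {
        if (!Directory.Exists(tessdataDir))
            return TessdataLayout.Missing;

        // Assumption: 2.00 data consists of several files per language.
        if (File.Exists(Path.Combine(tessdataDir, lang + ".inttemp")) &&
            File.Exists(Path.Combine(tessdataDir, lang + ".unicharset")))
            return TessdataLayout.Version2;

        // Assumption: 3.x data is one "<lang>.traineddata" file.
        if (File.Exists(Path.Combine(tessdataDir, lang + ".traineddata")))
            return TessdataLayout.Version3;

        return TessdataLayout.Unknown;
    }

    public static void Main(string[] args)
    {
        string dir = args.Length > 0 ? args[0] : @"C:\tessdata\";
        TessdataLayout layout = Detect(dir, "eng");
        Console.WriteLine("{0}: {1}", dir, layout);
        if (layout != TessdataLayout.Version2)
            Console.WriteLine("This folder does not look like version 2 data; try tesseract-2.00.eng.");
    }
}
```

Pass the folder as the first command-line argument, or leave it out to check the C:\tessdata\ path used in the question.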

Anyway, I found it. Thanks, everyone.
