C++ Library for image recognition: images containing words to string


Question

Does anyone know of a C++ library for taking an image and performing image recognition on it such that it can find letters based on a given font and/or font height? Even one that doesn't let you select a font would be nice (e.g. readLetters(Image image)).

Solution

I've been looking into this a lot lately. Your best bet is simply Tesseract. If you need layout analysis on top of the OCR, then go with Ocropus (which in turn uses Tesseract to do the OCR). Layout analysis refers to being able to detect the position of text on the image and do things like line segmentation, block segmentation, etc.

I've found some really good tips through experimentation with Tesseract that are worth sharing. Basically I had to do a lot of preprocessing for the image.

  1. Upsize/downsize your input image to 300 dpi.
  2. Remove color from the image. Grayscale is good. I actually used a dither threshold and made my input black and white.
  3. Cut out unnecessary junk from your image.

For all three of the above I used Netpbm (a set of image manipulation tools for Unix) to get to the point where I was getting pretty much 100 percent accuracy for what I needed.
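As a rough illustration of steps 1 and 2, here is a minimal C++ sketch that works on a raw pixel buffer. It is not what Netpbm does internally: the `RgbImage` type and all function names are hypothetical, the grayscale weights are the common integer luma approximation, and a plain fixed threshold stands in for the dither threshold mentioned above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal image type for illustration: 8-bit interleaved RGB.
struct RgbImage {
    std::size_t width = 0;
    std::size_t height = 0;
    std::vector<std::uint8_t> pixels;  // expected size: width * height * 3
};

// Step 1 helper: factor by which to resize a scan so it ends up at the
// target resolution (300 dpi by default). The resize itself is left to
// whatever imaging tool you use.
double scale_factor_for_dpi(double source_dpi, double target_dpi = 300.0) {
    return target_dpi / source_dpi;
}

// Step 2a: drop color by converting each pixel to a single gray value,
// using integer luma weights (0.299 R + 0.587 G + 0.114 B).
std::vector<std::uint8_t> to_grayscale(const RgbImage& img) {
    std::vector<std::uint8_t> gray(img.width * img.height);
    for (std::size_t i = 0; i < gray.size(); ++i) {
        const unsigned r = img.pixels[3 * i];
        const unsigned g = img.pixels[3 * i + 1];
        const unsigned b = img.pixels[3 * i + 2];
        gray[i] = static_cast<std::uint8_t>((299 * r + 587 * g + 114 * b) / 1000);
    }
    return gray;
}

// Step 2b: reduce to pure black and white with a fixed global threshold
// (an assumption made for brevity; the answer used a dither threshold).
std::vector<std::uint8_t> to_black_and_white(const std::vector<std::uint8_t>& gray,
                                             std::uint8_t threshold = 128) {
    std::vector<std::uint8_t> bw(gray.size());
    for (std::size_t i = 0; i < gray.size(); ++i) {
        bw[i] = gray[i] >= threshold ? 255 : 0;
    }
    return bw;
}
```

For example, a 150 dpi scan gives `scale_factor_for_dpi(150.0) == 2.0`, so it would need to be doubled in each dimension before being handed to the OCR engine.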

If you have a highly customized font and go with Tesseract alone, you have to "train" the system -- basically you have to feed it a bunch of training data. This is well documented on the tesseract-ocr site. You essentially create a new "language" for your font and pass it in with the -l parameter.
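To make the -l parameter concrete, here is a small C++ sketch that assembles a command line of the form `tesseract <image> <outputbase> -l <lang>`. The helper names are hypothetical, and `myfont` in the usage note is a placeholder for whatever name you gave your trained "language"; the quoting assumes a POSIX shell.

```cpp
#include <string>

// Wrap a value in single quotes for a POSIX shell, escaping any embedded
// single quote as '\'' so paths with spaces survive intact.
std::string shell_quote(const std::string& value) {
    std::string quoted = "'";
    for (char c : value) {
        if (c == '\'') {
            quoted += "'\\''";
        } else {
            quoted += c;
        }
    }
    quoted += "'";
    return quoted;
}

// Hypothetical helper: builds the command string only, it does not run it.
// `lang` is the name of the trained language data to load via -l.
std::string build_tesseract_command(const std::string& image_path,
                                    const std::string& output_base,
                                    const std::string& lang) {
    return "tesseract " + shell_quote(image_path) + " " +
           shell_quote(output_base) + " -l " + shell_quote(lang);
}
```

Calling `build_tesseract_command("scan.png", "out", "myfont")` yields `tesseract 'scan.png' 'out' -l 'myfont'`.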

The other training mechanism I found was with Ocropus, using neural net (bpnet) training. It requires a lot of input data to build a good statistical model.

In terms of invocation, Tesseract and Ocropus are both C++. It won't be as simple as ReadLines(Image), but there is an API you can check out. You can also invoke them via the command line.
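For the command-line route, one possible approach from C++ is to spawn the tool and capture what it prints. The sketch below is a generic helper, not part of Tesseract or Ocropus: `run_command` is a hypothetical name, and it relies on the POSIX `popen`/`pclose` functions, so it assumes a Unix-like system.

```cpp
#include <array>
#include <cstdio>
#include <stdexcept>
#include <string>

// Runs a shell command and returns everything it wrote to standard output.
// Throws if the process cannot be started or exits with a non-zero status.
std::string run_command(const std::string& command) {
    FILE* pipe = popen(command.c_str(), "r");
    if (pipe == nullptr) {
        throw std::runtime_error("popen failed");
    }

    std::string output;
    std::array<char, 4096> buffer;
    // fgets null-terminates each chunk, so it can be appended directly.
    while (fgets(buffer.data(), static_cast<int>(buffer.size()), pipe) != nullptr) {
        output += buffer.data();
    }

    const int status = pclose(pipe);
    if (status != 0) {
        throw std::runtime_error("command exited with a non-zero status");
    }
    return output;
}
```

With Tesseract installed, something like `run_command("tesseract scan.png stdout")` should return the recognized text, assuming your build accepts `stdout` as the output base; otherwise write to an output file and read that back.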
