How do I choose between Tesseract and OpenCV?


Question

I recently came across Tesseract and OpenCV. It looks like Tesseract is a full-fledged OCR engine, while OpenCV can be used as a framework to build an OCR application or service.

I tried Tesseract on some of my images and its accuracy seems decent. Later, I came across a very simple tutorial on performing OCR with OpenCV in Python and was impressed. Within a few minutes I had finished training the system, and its accuracy was good. But of course, taking this approach means I would need to train my system extensively on a large training set.

My specific questions are:


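  • How do I decide between using Tesseract and building a custom OCR application with OpenCV?

  • Tesseract has training data sets for different languages. Does OpenCV have something similar, so that I don't have to implement OCR from scratch?

  • Which one is better suited to a would-be commercial application?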

Any suggestions?

Note: Within 24 hours, I ...

Recommended answer


  • Tesseract is an OCR engine. It's used, worked on and funded by Google specifically to read text from images, perform basic document segmentation and operate on specific image inputs (a single word, line, paragraph, page, limited dictionaries, etc.).

  • OpenCV, on the other hand, is a computer vision library that includes features for feature extraction and data classification. You can build a simple letter segmenter and classifier that performs basic OCR, but the result is not a very good OCR engine. (I've made one in Python from scratch before; it is really inaccurate for input that deviates from your training data.)
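
As a rough illustration of the "specific image inputs" point, Tesseract can be told what kind of input to expect through its page segmentation mode (`--psm`). The sketch below assumes the `pytesseract` wrapper, Pillow and the `tesseract` binary are installed; the helper names `build_config` and `read_text` are hypothetical and not part of any library. The character whitelist is only a loose stand-in for "limited dictionaries", and how well it is honoured depends on the Tesseract version.

```python
# Hypothetical helpers; only the --psm numbers and the
# tessedit_char_whitelist variable come from Tesseract itself.
PSM_BY_UNIT = {
    "page": 3,       # fully automatic page segmentation (Tesseract's default)
    "paragraph": 6,  # a single uniform block of text
    "line": 7,       # a single text line
    "word": 8,       # a single word
}


def build_config(unit="page", whitelist=None):
    """Return a Tesseract config string for the given input unit."""
    if unit not in PSM_BY_UNIT:
        raise ValueError(f"unknown unit: {unit!r}")
    parts = [f"--psm {PSM_BY_UNIT[unit]}"]
    if whitelist:
        # Restrict recognition to these characters (support varies by version).
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


def read_text(image_path, unit="page", lang="eng", whitelist=None):
    """Run Tesseract on an image file via pytesseract (assumed installed)."""
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as image:
        return pytesseract.image_to_string(
            image, lang=lang, config=build_config(unit, whitelist)
        )
```

For example, `read_text("invoice_number.png", unit="line", whitelist="0123456789")` would treat a (hypothetical) image as one line of digits, while `read_text("scan.png")` leaves segmentation of a whole page to Tesseract.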

If you want to get a basic understanding of how hard OCR is, try OpenCV. Tesseract is for real OCR.
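
To see why a hand-rolled classifier breaks on input that deviates from its training data, here is a minimal sketch of a 1-nearest-neighbour letter classifier. It is written in plain Python as a stand-in for what one might build on OpenCV's machine learning module (for instance its k-nearest-neighbours model); the 3x3 glyphs and all names are made up for illustration and are not taken from the tutorial mentioned in the question.

```python
# Toy 3x3 "glyphs" flattened to 9 pixels (1 = ink). Purely illustrative data.
TRAINING = {
    "I": [(0, 1, 0,
           0, 1, 0,
           0, 1, 0)],
    "L": [(1, 0, 0,
           1, 0, 0,
           1, 1, 1)],
    "T": [(1, 1, 1,
           0, 1, 0,
           0, 1, 0)],
}


def distance(a, b):
    """Count the pixels where two glyphs differ (Hamming distance)."""
    return sum(x != y for x, y in zip(a, b))


def classify(glyph):
    """Return the label of the closest training glyph (1-nearest-neighbour)."""
    best_label, best_dist = None, None
    for label, samples in TRAINING.items():
        for sample in samples:
            d = distance(glyph, sample)
            if best_dist is None or d < best_dist:
                best_label, best_dist = label, d
    return best_label


# A clean "T" matches its training sample exactly.
clean_t = (1, 1, 1,
           0, 1, 0,
           0, 1, 0)
# The same "T" shifted one pixel to the left.
shifted_t = (1, 1, 0,
             1, 0, 0,
             1, 0, 0)

print(classify(clean_t))    # prints T
print(classify(shifted_t))  # prints L
```

A one-pixel shift is enough to turn the "T" into an "L" here, because the classifier compares raw pixels and has never seen a shifted sample. Coping with that takes either far more training data or the kind of normalisation and segmentation that a dedicated OCR engine already provides.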
