Does Tesseract neglect any nontext area in a scanned document?

Views: 143 | Published: 2018/7/30 17:10:57 | Tags: image-processing ocr tesseract text-extraction

Question

I'm using Tesseract, but I don't know whether it ignores non-text areas and targets only the text. Do I have to remove non-text areas as a preprocessing step to get better output?

Recommended answer

Tesseract has a pretty good algorithm for detecting text, but sooner or later it will produce false-positive matches, reporting "text" in regions that contain none.
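One way to see those false positives, and to drop the most obvious ones after the fact, is to look at the per-word confidence Tesseract reports in its TSV output (for example from `tesseract scan.png stdout tsv`). The sketch below is only an illustration: the function name `confident_words`, the sample rows, and the threshold of 60 are assumptions made for this example, not values from the answer, and a sensible cutoff depends on your documents.

```python
import csv
import io

# Hypothetical sample in the column layout of Tesseract's TSV output.
SAMPLE_TSV = "\n".join([
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num"
    "\tleft\ttop\twidth\theight\tconf\ttext",
    "1\t1\t0\t0\t0\t0\t0\t0\t640\t480\t-1\t",            # page row, no text
    "5\t1\t1\t1\t1\t1\t36\t92\t60\t24\t96.5\tInvoice",   # a real word
    "5\t1\t2\t1\t1\t1\t300\t200\t12\t40\t21.0\t|l",      # noise from a ruled line
])


def confident_words(tsv_text, min_conf=60.0):
    """Return (text, conf) pairs for word rows with conf >= min_conf.

    min_conf is an arbitrary illustrative threshold, not a recommended value.
    """
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    words = []
    for row in reader:
        text = (row.get("text") or "").strip()
        try:
            conf = float(row.get("conf") or -1)
        except ValueError:
            continue  # skip rows whose confidence cannot be parsed
        # Level 5 rows are individual words; structural rows carry conf -1.
        if row.get("level") == "5" and text and conf >= min_conf:
            words.append((text, conf))
    return words


if __name__ == "__main__":
    print(confident_words(SAMPLE_TSV))
```

Filtering like this only hides low-confidence noise after recognition; it does not replace cleaning up the image beforehand.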

Ideally, you would pre-process the image before submitting it to Tesseract. Some time ago I worked on a similar task, so I suggest you take a look at the following material:

OpenCV C++/Obj-C: Detecting a sheet of paper / Square Detection

Executing cv::warpPerspective for a fake deskewing on a set of cv::Point

Rotate cv::Mat using cv::warpAffine offsets destination image

Affine Transform, Simple Rotation and Scaling or something else entirely?
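The rotation-related links above deal with a common deskewing pitfall: rotating a page about its center clips the corners unless the output canvas is enlarged and the transform is shifted to match. The following is a minimal sketch of that arithmetic in plain Python, using the 2x3 matrix layout documented for `cv::getRotationMatrix2D` with scale 1. The function name `rotation_matrix_with_bounds` is hypothetical, and this is not code taken from the linked answers.

```python
import math


def rotation_matrix_with_bounds(width, height, angle_deg):
    """Return (matrix, (new_w, new_h)) for a rotation about the image center.

    matrix is a 2x3 affine transform whose translation has been adjusted so
    that the rotated image is centered in a canvas of size (new_w, new_h).
    """
    cx, cy = width / 2.0, height / 2.0
    theta = math.radians(angle_deg)
    alpha, beta = math.cos(theta), math.sin(theta)

    # Rotation about (cx, cy), same layout as cv::getRotationMatrix2D.
    matrix = [
        [alpha, beta, (1 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1 - alpha) * cy],
    ]

    # Size of the axis-aligned box that contains the rotated image.
    new_w = int(round(abs(alpha) * width + abs(beta) * height))
    new_h = int(round(abs(beta) * width + abs(alpha) * height))

    # Shift the result so the old center lands on the new canvas center.
    matrix[0][2] += new_w / 2.0 - cx
    matrix[1][2] += new_h / 2.0 - cy
    return matrix, (new_w, new_h)


def apply_affine(matrix, x, y):
    """Map a point through a 2x3 affine matrix."""
    return (
        matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
        matrix[1][0] * x + matrix[1][1] * y + matrix[1][2],
    )


if __name__ == "__main__":
    m, size = rotation_matrix_with_bounds(200, 100, 90)
    print(size, apply_affine(m, 100, 50))
```

With OpenCV, a matrix and size like these would be passed to `cv::warpAffine` as the transform and destination size, so the rotated page is not cropped before it goes to Tesseract.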
