Tesseract has trouble reading this extremely simple string of numbers

Views: 487 Published: 2020/5/19 19:35:42 Tags: python string ocr tesseract digits

Question

I'm currently writing a Python script that needs to use Tesseract to read a number like this:

Using digits only and -psm 6 (or 7), it outputs 5.551 instead of 6.881.
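
For reference, the invocation described above might be driven from Python roughly as follows. This is only a sketch: the helper names `build_tesseract_cmd` and `run_tesseract` are made up for illustration, and it assumes "digits only" means the `digits` config file that ships with Tesseract. The `-psm` spelling matches the Tesseract 3.x command line used here; Tesseract 4 and later expect `--psm` instead.

```python
import subprocess


def build_tesseract_cmd(image_path, output_base, psm=7,
                        psm_flag="-psm", digits_only=True):
    """Build the argument list for a tesseract call (nothing is executed here).

    psm_flag is "-psm" for Tesseract 3.x and "--psm" for 4.x and later.
    """
    cmd = ["tesseract", image_path, output_base, psm_flag, str(psm)]
    if digits_only:
        # Assumption: "digits only" refers to the stock "digits" config file,
        # which limits recognition to numeric characters.
        cmd.append("digits")
    return cmd


def run_tesseract(image_path, output_base, **kwargs):
    """Run tesseract and return the recognised text with whitespace stripped."""
    subprocess.run(build_tesseract_cmd(image_path, output_base, **kwargs),
                   check=True)
    # Tesseract appends ".txt" to the output base name.
    with open(output_base + ".txt", encoding="utf-8") as f:
        return f.read().strip()
```

Calling `run_tesseract("number.png", "out", psm=6)` would then write `out.txt` and return its contents; `number.png` is a placeholder file name.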

I have had some success with other numbers (5.700 works), but this particular number is giving me a ton of problems. Unfortunately, I need a high degree of accuracy for my program, and I thought Tesseract would be able to decipher such a simple string.

I have also tried GOCR, which correctly read 6.881 (yay!) but gave the output 5._00 for 5.700 (boo!).

Any idea why it would be doing this?

Or, more importantly, is there anything I can do to get around the problem (preferably without having to train Tesseract)?

Recommended answer

Using ImageMagick (you can use something else if you want), I doubled the image's size and removed the transparency (replacing it with white), and Tesseract then OCR'd the enhanced image correctly:

$ convert I1Zau.png -background white -flatten -resize 200% I1Zau_2.png
$ tesseract I1Zau_2.png o.txt
$ cat o.txt.txt 
6.881
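
Since the question is about a Python script, the same two steps could be wrapped along these lines. Treat it as a sketch rather than a tested recipe: `build_convert_cmd` and `ocr_enhanced` are hypothetical helper names, and it assumes the `convert` and `tesseract` executables are on the PATH.

```python
import subprocess


def build_convert_cmd(src, dst, scale_percent=200):
    """Argument list for the ImageMagick step: flatten onto white, then upscale."""
    return ["convert", src,
            "-background", "white", "-flatten",  # replace transparency with white
            "-resize", f"{scale_percent}%",      # 200% doubles the image size
            dst]


def ocr_enhanced(src, enhanced, output_base):
    """Preprocess src with ImageMagick, OCR the result, and return the text."""
    subprocess.run(build_convert_cmd(src, enhanced), check=True)
    subprocess.run(["tesseract", enhanced, output_base], check=True)
    # Tesseract appends ".txt" to the output base, which is why the shell
    # session above ends up reading o.txt.txt.
    with open(output_base + ".txt", encoding="utf-8") as f:
        return f.read().strip()
```

With the file names from the commands above, the call would be `ocr_enhanced("I1Zau.png", "I1Zau_2.png", "o")`, which writes the text to `o.txt`.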
