Python Tesseract OCR question

Question

I have this image:

I want to read it into a string using Python, which I didn't think would be that hard. I came across Tesseract, and then a Python wrapper for scripts that use Tesseract.

So I started reading images, and it did great until I tried to read this one. Am I going to have to train it to read that specific font? Any ideas on what that font is? Or is there a better OCR engine I could use with Python to get this job done?

Edit: Perhaps I could make some sort of vector around the numbers, then redraw them at a larger size? The larger the images are, the better Tesseract seems to read them (no surprise, lol).
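The enlargement idea does not strictly need vectorisation; plain resampling is a simpler thing to try first. Below is a minimal sketch, assuming Pillow is installed. The helper names `scaled_size` and `upscale` and the default factor of 3 are illustrative choices, not values from the question, and whether bicubic resampling helps on this particular font is untested.

```python
def scaled_size(size, factor):
    """Return (width, height) multiplied by factor, rounded to whole pixels."""
    width, height = size
    # Never let a dimension collapse to zero pixels.
    return (max(1, round(width * factor)), max(1, round(height * factor)))


def upscale(image, factor=3):
    """Return an enlarged copy of a PIL image using bicubic resampling."""
    from PIL import Image  # imported here so scaled_size works without Pillow

    return image.resize(scaled_size(image.size, factor), Image.BICUBIC)
```

Something like `upscale(Image.open("numbers.png"))` (the file name is hypothetical) would then be passed to the OCR wrapper instead of the original image.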

Recommended answer

Just train the engine on the 10 digits and a '.'. That should do it. And make sure you convert your image to grayscale before running OCR on it.
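As a lighter-weight step that may be worth trying before any training, the sketch below converts the image to grayscale and restricts recognition to digits and '.'. Note that this does not train anything: it swaps in Tesseract's `tessedit_char_whitelist` setting instead. It assumes the `pytesseract` wrapper, Pillow, and a Tesseract binary on the PATH; the function names and the choice of page segmentation mode 7 (single text line) are assumptions, and the whitelist may be ignored by some Tesseract 4.0 builds using the LSTM engine.

```python
def digits_config(psm=7):
    """Build a Tesseract config string that only allows 0-9 and '.'."""
    return f"--psm {psm} -c tessedit_char_whitelist=0123456789."


def read_number(path):
    """Open an image, convert it to grayscale and OCR it as digits."""
    from PIL import Image
    import pytesseract  # needs the tesseract binary on PATH

    gray = Image.open(path).convert("L")  # "L" is 8-bit grayscale
    text = pytesseract.image_to_string(gray, config=digits_config())
    return text.strip()
```

If the restricted character set still misreads the font, training on the 10 digits and '.' as the answer suggests remains the fallback.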
