Python Tesseract can't recognize this font


Question

I have this image:

I want to read it into a string using Python, which I didn't think would be that hard. I came across Tesseract, and then a Python wrapper for it.

So I started reading images, and it worked great until I tried to read this one. Am I going to have to train it to read that specific font? Any ideas what that font is? Or is there a better OCR engine I could use with Python to get this job done?

Perhaps I could make some sort of vector outline around the numbers, then redraw them at a larger size? The larger the images are, the better Tesseract seems to read them (no surprise, lol).
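A simpler variant of the enlarging idea is to resample the bitmap directly rather than vectorize it. The sketch below assumes the Pillow library is installed; the function name `upscale` and the default factor of 3 are illustrative choices, not something prescribed by Tesseract.

```python
from PIL import Image


def upscale(image, factor=3):
    """Return a copy of `image` enlarged by an integer `factor`.

    LANCZOS resampling is used because it tends to keep glyph edges
    smoother than nearest-neighbour scaling.
    """
    width, height = image.size
    return image.resize((width * factor, height * factor), Image.LANCZOS)


if __name__ == "__main__":
    import sys

    # Usage: python upscale.py input.png  (file names are hypothetical)
    enlarged = upscale(Image.open(sys.argv[1]))
    enlarged.save("enlarged.png")
```

Whether a factor of 2, 3, or more helps is something to find by experiment on the actual image.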

Recommended answer

Just train the engine on the 10 digits and a '.'. That should do it. And make sure you convert your image to grayscale before OCRing it.
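The grayscale step can be sketched with Pillow and the `pytesseract` wrapper, assuming both are installed along with the Tesseract binary. Note that this sketch does not train the engine. As a lighter-weight alternative to try first, it restricts recognition to digits and '.' through Tesseract's `tessedit_char_whitelist` variable and assumes the image holds a single line of text (`--psm 7`). Both settings are assumptions about the image, and the whitelist may be ignored by some Tesseract 4.0 LSTM builds. The helper names are hypothetical.

```python
from PIL import Image

# Assumptions: the image contains one line of text made only of digits and '.'.
DIGITS_CONFIG = "--psm 7 -c tessedit_char_whitelist=0123456789."


def to_grayscale(image):
    """Convert an image to 8-bit grayscale (Pillow mode "L")."""
    return image.convert("L")


def read_number(path):
    """Run Tesseract on the grayscale version of the image at `path`."""
    # Imported here so the preprocessing helper works without pytesseract.
    import pytesseract

    gray = to_grayscale(Image.open(path))
    return pytesseract.image_to_string(gray, config=DIGITS_CONFIG).strip()


if __name__ == "__main__":
    import sys

    # Usage: python read_number.py input.png  (file name is hypothetical)
    print(read_number(sys.argv[1]))
```

If the result is still wrong with the whitelist in place, training on the specific font, as the answer suggests, is the next thing to try.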
