Python Tesseract can't recognize this font
Question
I have this image:
I want to read it into a string using Python, which I didn't think would be that hard. I came upon Tesseract, and then a Python wrapper for it.
So I started reading images, and it did great until I tried to read this one. Am I going to have to train it to read that specific font? Any ideas on what that font is? Or is there a better OCR engine I could use with Python to get this job done?
Perhaps I could trace some sort of vector outline around the numbers, then redraw them at a larger size? The larger the images are, the better Tesseract seems to read them (no surprise lol).
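The enlarging idea can be tried without any vector tracing by simply resampling the bitmap. This is a minimal sketch, assuming Pillow is installed; the helper name `upscale` and the default factor of 3 are made up for illustration, and the right factor for this particular image is something to find by experiment.

```python
from PIL import Image


def upscale(img, factor=3):
    """Return a copy of img enlarged by an integer factor.

    `factor` is an illustrative default, not a value tuned for any image.
    """
    new_size = (img.width * factor, img.height * factor)
    # LANCZOS is a smooth resampling filter, so glyph edges stay
    # cleaner than with nearest-neighbour scaling.
    return img.resize(new_size, Image.LANCZOS)
```

The enlarged image can then be passed to the OCR wrapper in place of the original.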
Recommended answer
Just train the engine on the 10 digits and a '.'. That should do it. And make sure you convert your image to grayscale before running OCR on it.
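The grayscale step might look like the sketch below, assuming Pillow and the pytesseract wrapper (the question does not say which wrapper was used). Note that the character whitelist is not the training the answer recommends; it is a lighter-weight restriction to the same 11 characters that may be worth trying first, and some Tesseract versions ignore it with the LSTM engine. The names `to_grayscale`, `read_number`, and `DIGIT_CONFIG` are hypothetical.

```python
from PIL import Image

# Assumption: the image holds a single line of text (--psm 7) made up
# only of digits and '.'. The whitelist limits recognition to those.
DIGIT_CONFIG = "--psm 7 -c tessedit_char_whitelist=0123456789."


def to_grayscale(img):
    """Convert an image to 8-bit grayscale (Pillow mode 'L')."""
    return img.convert("L")


def read_number(path):
    """OCR the image at `path` and return the recognized text."""
    # Imported here so to_grayscale works without Tesseract installed.
    import pytesseract

    with Image.open(path) as img:
        gray = to_grayscale(img)
        text = pytesseract.image_to_string(gray, config=DIGIT_CONFIG)
    return text.strip()
```

If the whitelist alone does not help with this font, training on the digits as suggested above is still the fallback.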