Why does pytesseract fail to recognise digits from image with darker background?


Question

I have this Python code that I use to convert text written in a picture to a string. It works for certain images that have large characters, but not for the one I'm trying right now, which contains only digits.

Here is the image:

Here is my code:

import pytesseract
from PIL import Image

# Point pytesseract at the Tesseract executable before running OCR
pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract'

img = Image.open('img.png')
result = pytesseract.image_to_string(img)
print(result)

Why does it fail to recognise this specific image, and how can I solve this problem?

Recommended answer

I have two suggestions.

First, and this is by far the most important: in OCR, preprocessing images is key to obtaining good results. In your case I suggest binarization. Your images look extremely good, so you shouldn't have any problem, but if you do, try binarizing them:

import cv2
from PIL import Image

img = cv2.imread('gradient.png')
# If your image is not already grayscale:
# img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
threshold = 180  # cutoff in the 0-255 range, to be determined for your image
_, img_binarized = cv2.threshold(img, threshold, 255, cv2.THRESH_BINARY)
pil_img = Image.fromarray(img_binarized)  # PIL image to pass to the OCR call

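Then try the OCR again with the binarized image.

Check whether your image is grayscale and uncomment the conversion line if needed. Note that cv2.imread returns a 3-channel BGR array by default, even when the file itself is grayscale, so the conversion is usually worth applying.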
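To make the thresholding step concrete, here is a minimal pure-Python sketch of what cv2.THRESH_BINARY does to each pixel: values strictly above the threshold become the maximum value, everything else becomes 0. The helper name `binarize` and the tiny pixel patch are made up for illustration; on real images you would keep using cv2.threshold as above.

```python
def binarize(gray_rows, threshold, max_value=255):
    """Mimic cv2.THRESH_BINARY on a grayscale image given as nested lists:
    pixels strictly greater than `threshold` become `max_value`, others 0."""
    return [[max_value if p > threshold else 0 for p in row] for row in gray_rows]


# Made-up 3x4 grayscale patch: light digit strokes (~200) on a dark background (~40).
patch = [
    [40, 45, 210, 42],
    [38, 205, 200, 41],
    [39, 44, 198, 43],
]

binary = binarize(patch, threshold=180)
for row in binary:
    print(row)
```

With the threshold at 180, the dark background collapses to 0 and the light strokes to 255, which is the clean two-level image the OCR step is meant to receive.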

This is simple thresholding. Adaptive thresholding also exists, but it is noisy and brings nothing in your case.

Binarized images are much easier for Tesseract to handle. Tesseract already binarizes internally (https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality), but that step can go wrong, and very often it's useful to do your own preprocessing.

You can check whether the threshold value is right by looking at the images:

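import matplotlib.pyplot as plt
# Show both images side by side; two imshow calls on the same axes would overwrite each other
fig, (ax_before, ax_after) = plt.subplots(1, 2)
ax_before.imshow(img, cmap='gray')
ax_after.imshow(img_binarized, cmap='gray')
plt.show()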
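Instead of tuning the threshold purely by eye, a starting value can be computed from the pixel histogram. The sketch below uses Otsu's method, which is not part of the original answer; it is swapped in here only as one common way to pick a cutoff for an image with two well-separated brightness groups. The function name and sample pixels are hypothetical. OpenCV offers the same idea through the cv2.THRESH_OTSU flag, for example `cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)` on a single-channel 8-bit image.

```python
def otsu_threshold(pixels):
    """Return a threshold t in 0..255 that maximises the between-class variance,
    where class 0 holds pixels <= t and class 1 holds pixels > t."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(value * count for value, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in class 0 so far
    sum0 = 0    # sum of pixel values in class 0 so far
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # class 0 still empty
        w1 = total - w0
        if w1 == 0:
            break             # class 1 empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


# Made-up flat list of 8-bit pixels: dark background plus a few light strokes.
sample = [40, 45, 42, 38, 41, 39, 44, 43, 210, 205, 200, 198]
t = otsu_threshold(sample)
dark = [p for p in sample if p <= t]
light = [p for p in sample if p > t]
print(t, len(dark), len(light))
```

Any cutoff that falls in the gap between the two groups separates them identically, so the computed value is best treated as a starting point to confirm visually, not as a definitive setting.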

Second, if what I said above still doesn't work, I suggest you try out tesserocr. I know this doesn't answer "why doesn't pytesseract work here", but it is a maintained Python wrapper for Tesseract.

You can try:

import tesserocr
text_from_ocr = tesserocr.image_to_text(pil_img)

Here is the doc for tesserocr on PyPI: https://pypi.org/project/tesserocr/

And for opencv: https://pypi.org/project/opencv-python/

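As a side note, black and white are treated symmetrically in Tesseract, so having white digits on a black background should not be a problem. Be aware, though, that the ImproveQuality page linked above says this holds for Tesseract 3.05 and older and recommends dark text on a light background for 4.x, so on newer versions inverting the image is worth trying.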
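Whether polarity matters for your setup can also be checked empirically: invert the binarized image and run the OCR on both versions. Below is a minimal pure-Python sketch of the inversion itself, using a made-up 2x3 binary patch and a hypothetical helper `invert`. For a NumPy array coming from OpenCV, the equivalent call would be `cv2.bitwise_not(img_binarized)`.

```python
def invert(binary_rows, max_value=255):
    """Swap black and white in an image given as nested lists of 0..max_value."""
    return [[max_value - p for p in row] for row in binary_rows]


# Made-up binarized patch: a white vertical stroke on a black background.
white_on_black = [
    [0, 255, 0],
    [0, 255, 0],
]

black_on_white = invert(white_on_black)
for row in black_on_white:
    print(row)
```

If the two versions give different OCR results, polarity is affecting recognition in your Tesseract version and the inverted image is the one to keep.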
