Improving pytesseract text recognition accuracy on an image


Problem description

I am trying to read a captcha using the pytesseract module. It returns the correct text most of the time, but not always.

This is the code that reads the image, manipulates it, and extracts the text:

import cv2
import numpy as np
import pytesseract

def read_captcha():
    # opencv loads the image in BGR, convert it to RGB
    img = cv2.cvtColor(cv2.imread('captcha.png'), cv2.COLOR_BGR2RGB)

    lower_white = np.array([200, 200, 200], dtype=np.uint8)
    upper_white = np.array([255, 255, 255], dtype=np.uint8)

    mask = cv2.inRange(img, lower_white, upper_white)  # could also use threshold
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3)))  # "erase" the small white points in the resulting mask
    mask = cv2.bitwise_not(mask)  # invert mask

    # load background (could be an image too)
    bk = np.full(img.shape, 255, dtype=np.uint8)  # white bk

    # get masked foreground
    fg_masked = cv2.bitwise_and(img, img, mask=mask)

    # get masked background, mask must be inverted 
    mask = cv2.bitwise_not(mask)
    bk_masked = cv2.bitwise_and(bk, bk, mask=mask)

    # combine masked foreground and masked background 
    final = cv2.bitwise_or(fg_masked, bk_masked)
    mask = cv2.bitwise_not(mask)  # revert mask to original

    # resize the image
    img = cv2.resize(mask,(0,0),fx=3,fy=3)
    cv2.imwrite('ocr.png', img)

    text = pytesseract.image_to_string(cv2.imread('ocr.png'), lang='eng')

    return text

For the image manipulation, I got help from this Stack Overflow post.

This is the original captcha image:

And this is the image generated after the manipulation:

However, pytesseract returns the text: AX#7rL.

Can anyone guide me on how to improve the success rate to 100% here?

Recommended answer

Since there are tiny holes in your resulting image, a morphological transformation, specifically cv2.MORPH_CLOSE, should work here: it closes the holes and smooths the image.

Threshold to obtain a binary image (black and white)

Perform a morphological close to fill the small holes in the foreground

Invert the image to get the result

4X#7rL


Applying cv2.GaussianBlur() before passing the image to Tesseract might help too.

import cv2
import pytesseract

# Path for Windows
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Read in image as grayscale
image = cv2.imread('1.png',0)
# Threshold to obtain binary image
thresh = cv2.threshold(image, 220, 255, cv2.THRESH_BINARY)[1]

# Create custom kernel
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
# Perform closing (dilation followed by erosion)
close = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)

# Invert image to use for Tesseract
result = 255 - close
cv2.imshow('thresh', thresh)
cv2.imshow('close', close)
cv2.imshow('result', result)

# Throw image into tesseract
print(pytesseract.image_to_string(result))
cv2.waitKey()
