Captcha preprocessing and solving with OpenCV and pytesseract

Question

I am trying to write Python code for image preprocessing and recognition using Tesseract-OCR. My goal is to solve this form of captcha reliably.

[Image: original captcha and the result of each preprocessing step]

Steps so far

  1. Greyscale and thresholding of the image
  2. Enhance the image with PIL
  3. Convert to TIF and scale to >300px
  4. Feed it to Tesseract-OCR (whitelisting all uppercase letters)

However, I still get a rather incorrect reading (EPQ M Q). What other preprocessing steps can I take to improve accuracy? My code and additional captchas of a similar nature are appended below.

[Image: similar captchas that I want to solve]

Code

import cv2
import pytesseract
from PIL import Image, ImageEnhance, ImageFilter


def binarize_image_using_opencv(captcha_path, binary_image_path='input-black-n-white.jpg'):
    im_gray = cv2.imread(captcha_path, cv2.IMREAD_GRAYSCALE)
    # Fixed, hand-picked threshold of 85; cv2.threshold returns (threshold_used, image)
    thresh, im_bw = cv2.threshold(im_gray, 85, 255, cv2.THRESH_BINARY)
    cv2.imwrite(binary_image_path, im_bw)

    return binary_image_path


def preprocess_image_using_opencv(captcha_path):
    bin_image_path = binarize_image_using_opencv(captcha_path)

    im_bin = Image.open(bin_image_path)
    basewidth = 300  # in pixels
    wpercent = basewidth / float(im_bin.size[0])
    hsize = int(float(im_bin.size[1]) * wpercent)
    big = im_bin.resize((basewidth, hsize), Image.NEAREST)

    # Save the enlarged image as TIF for the OCR step
    tif_file = "input-NEAREST.tif"
    big.save(tif_file)

    return tif_file


def get_captcha_text_from_captcha_image(captcha_path):
    # Preprocess the image before OCR
    tif_file = preprocess_image_using_opencv(captcha_path)

    # Denoise, boost contrast, and convert to a 1-bit image
    im = Image.open(tif_file)
    im = im.filter(ImageFilter.MedianFilter())
    im = ImageEnhance.Contrast(im).enhance(2)
    im = im.convert('1')
    im.save('captchafinal.tif')

    # Tesseract 4+ expects --psm; older 3.x builds used -psm
    return pytesseract.image_to_string(
        Image.open('captchafinal.tif'),
        config="-c tessedit_char_whitelist=ABCDEFGHIJKLMNOPQRSTUVWXYZ --psm 6")


print(get_captcha_text_from_captcha_image("path/captcha.png"))

Recommended answer

The major problem comes from the differing orientations of the letters, not from the preprocessing stage. Your preprocessing is the common approach and should work well, but you can replace the fixed threshold with adaptive thresholding to make the program more robust to variations in image brightness.

I ran into the same problem when I was using Tesseract for car license plate recognition. From that experience I learned that Tesseract is very sensitive to the orientation of text in the image. It recognizes letters well when the text is horizontal, and the closer the text is to horizontal, the better the result.

So you need an algorithm that detects each letter in the captcha image, determines its orientation, and rotates it to make it horizontal. Then apply your preprocessing, run Tesseract on the rotated piece of the image, and append its output to your result string. Repeat for the next letter until all letters are processed. You will also need an image transformation function to rotate the letters, and you have to think about finding the corners of each detected letter. This project may help you, because it rotates text in the image to improve Tesseract's results.
