Image to text recognition using Tesseract-OCR is better when the image is preprocessed manually using Gimp than with my Python code


Question

I am trying to write code in Python for the manual Image preprocessing and recognition using Tesseract-OCR.

Manual process:
For manually recognizing text for a single Image, I preprocess the Image using Gimp and create a TIF image. Then I feed it to Tesseract-OCR which recognizes it correctly.

To preprocess the image using Gimp I do -


  1. Change mode to RGB / Grayscale
    Menu -- Image -- Mode -- RGB
  2. Thresholding
    Menu -- Tools -- Color Tools -- Threshold -- Auto
  3. Change mode to Indexed
    Menu -- Image -- Mode -- Indexed
  4. Resize / Scale to Width > 300px
    Menu -- Image -- Scale image -- Width=300
  5. Save as Tif

Then I feed it to tesseract -

$ tesseract captcha.tif output -psm 6

And I get an accurate result all the time.
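The same command line can be assembled from Python, which makes it easier to compare the manual route with the scripted one. The following is only a minimal sketch: the helper names `build_tesseract_command` and `run_tesseract` and the `major_version` parameter are hypothetical. It relies on two facts about the CLI: Tesseract 3.x spells the page segmentation flag `-psm`, while 4.x and later spell it `--psm`, and the recognized text is written to `<output base>.txt`.

```python
import subprocess


def build_tesseract_command(image_path, output_base, psm=6, major_version=3):
    """Return the argv list for a tesseract call (hypothetical helper).

    Tesseract 3.x uses `-psm`; 4.x and later use `--psm`.
    """
    psm_flag = "-psm" if major_version < 4 else "--psm"
    return ["tesseract", image_path, output_base, psm_flag, str(psm)]


def run_tesseract(image_path, output_base, psm=6, major_version=3):
    """Run tesseract and return the recognized text.

    Assumes the `tesseract` binary is on PATH; it writes its result
    to `<output_base>.txt`.
    """
    subprocess.check_call(
        build_tesseract_command(image_path, output_base, psm, major_version)
    )
    with open(output_base + ".txt", encoding="utf-8") as handle:
        return handle.read()
```

Calling `run_tesseract("captcha.tif", "output")` would then mirror the command shown above.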

Python code:
I have tried to replicate the above procedure using OpenCV and Tesseract -

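import cv2
from PIL import Image
from pytesseract import image_to_string  # assumed source of image_to_string

def binarize_image_using_opencv(captcha_path, binary_image_path='input-black-n-white.jpg'):
    # cv2.CV_LOAD_IMAGE_GRAYSCALE is the OpenCV 2.x name; OpenCV 3+ calls it cv2.IMREAD_GRAYSCALE
    im_gray = cv2.imread(captcha_path, cv2.CV_LOAD_IMAGE_GRAYSCALE)
    (thresh, im_bw) = cv2.threshold(im_gray, 128, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    # re-apply the threshold Otsu picked; swap in another value here if it is unsuitable
    im_bw = cv2.threshold(im_gray, thresh, 255, cv2.THRESH_BINARY)[1]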
    cv2.imwrite(binary_image_path, im_bw)

    return binary_image_path

def preprocess_image_using_opencv(captcha_path):
    bin_image_path = binarize_image_using_opencv(captcha_path)

    im_bin = Image.open(bin_image_path)
    basewidth = 300  # in pixels
    wpercent = (basewidth/float(im_bin.size[0]))
    hsize = int((float(im_bin.size[1])*float(wpercent)))
    big = im_bin.resize((basewidth, hsize), Image.NEAREST)

    # tesseract-ocr only works with TIF so save the bigger image in that format
    tif_file = "input-NEAREST.tif"
    big.save(tif_file)

    return tif_file

def get_captcha_text_from_captcha_image(captcha_path):

    # Preprocess the image before OCR
    tif_file = preprocess_image_using_opencv(captcha_path)

    # Perform OCR (Optical Character Recognition) using the tesseract-ocr library
    image = Image.open(tif_file)
    ocr_text = image_to_string(image, config="-psm 6")
    alphanumeric_text = ''.join(e for e in ocr_text)

    return alphanumeric_text    

But I am not getting the same accuracy. What did I miss?
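For reference, the `cv2.THRESH_OTSU` flag in the code above asks OpenCV to choose the threshold automatically using Otsu's method, which picks the grey level that best separates the histogram into two classes. The pure-Python sketch below illustrates that rule so the chosen threshold can be inspected. It is not OpenCV's implementation, the names `otsu_threshold` and `binarize` are hypothetical, and whether Gimp's Auto button applies the same rule is not established here.

```python
def otsu_threshold(pixels):
    """Return the grey level (0-255) that maximizes between-class variance.

    `pixels` is a flat sequence of integer grey values in 0..255.
    Pixels <= the returned level form one class, the rest the other.
    """
    hist = [0] * 256
    for value in pixels:
        hist[value] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_level, best_variance = 0, -1.0
    weight_low, sum_low = 0, 0
    for level in range(256):
        weight_low += hist[level]
        sum_low += level * hist[level]
        if weight_low == 0:
            continue  # no pixels in the lower class yet
        weight_high = total - weight_low
        if weight_high == 0:
            break  # every pixel is already in the lower class
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_level, best_variance = level, variance
    return best_level


def binarize(pixels, level):
    """Map pixels above `level` to 255 and the rest to 0."""
    return [255 if value > level else 0 for value in pixels]
```

Comparing the level this returns for a captcha with the value Gimp settles on is one way to check whether the two pipelines binarize the image alike.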


  1. Original image

  2. Tif image created using Gimp

Update 2:

This code is available at https://github.com/hussaintamboli/python-image-to-text

Recommended answer

If the output deviates only minimally from your expected output (e.g. extra characters such as ' or ", as your comments suggest), try limiting character recognition to the character set you expect (e.g. alphanumeric).
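One way to do this, sketched under a few assumptions: `image_to_string` is taken to come from `pytesseract`, and the helper names below are hypothetical. Tesseract exposes a `tessedit_char_whitelist` variable that can be set with `-c`. Support for it varies between Tesseract versions and engines, so the sketch also filters the returned text in Python as a fallback.

```python
import string

# Characters we expect in the captcha (assumption: ASCII letters and digits).
ALPHANUMERIC = string.ascii_letters + string.digits


def build_whitelist_config(allowed=ALPHANUMERIC, psm=6):
    """Build a config string that restricts recognition to `allowed`.

    `-psm` matches the flag used in the question; Tesseract 4+ spells it `--psm`.
    """
    return "-psm {} -c tessedit_char_whitelist={}".format(psm, allowed)


def keep_allowed(text, allowed=ALPHANUMERIC):
    """Drop every character that is not in `allowed` (post-filter fallback)."""
    return "".join(ch for ch in text if ch in allowed)


def ocr_alphanumeric(image):
    """OCR a PIL image and return only the expected characters."""
    # Imported lazily so the helpers above work without pytesseract installed.
    from pytesseract import image_to_string

    raw_text = image_to_string(image, config=build_whitelist_config())
    return keep_allowed(raw_text)
```

In the question's code, `get_captcha_text_from_captcha_image` could call `ocr_alphanumeric(image)` in place of the bare `image_to_string` call.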
