适用于OCR的Python OpenCV偏斜校正 [英] Python OpenCV skew correction for OCR

查看:1276
本文介绍了适用于OCR的Python OpenCV偏斜校正的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

当前,我正在一个OCR项目中,我需要从标签上读取文本(请参见下面的示例图像).我遇到了图像歪斜的问题,我需要帮助修复图像歪斜,以便文本是水平的而不是倾斜的.目前,我正在使用该过程尝试从给定范围内对不同角度进行评分(以下代码包括在内),但是此方法不一致,有时会过度校正图像歪斜或平坦,无法识别出歪斜并对其进行校正.请注意,在偏斜校正之前,我将所有图像旋转270度以使文本直立,然后将图像通过下面的代码传递.传递给函数的图像已经是二进制图像.

Currently, I am working on an OCR project where I need to read the text off of a label (see example images below). I am running into issues with the image skew and I need help fixing the image skew so the text is horizontal and not at an angle. Currently the process I am using attempts to score different angles from a given range (code included below), but this method is inconsistent and sometimes overcorrects an image skew or flat out fails to identify the skew and correct it. Just as a note, before the skew correction I am rotating all of the images by 270 degrees to get the text upright, then I am passing the image through the code below. The image passed through to the function is already a binary image.

代码:


def findScore(img, angle):
    """
    Generates a score for the binary image recieved dependent on the determined angle.\n
    Vars:\n
    - array <- numpy array of the label\n
    - angle <- predicted angle at which the image is rotated by\n
    Returns:\n
    - histogram of the image
    - score of potential angle
    """
    data = inter.rotate(img, angle, reshape = False, order = 0)
    hist = np.sum(data, axis = 1)
    score = np.sum((hist[1:] - hist[:-1]) ** 2)
    return hist, score

def skewCorrect(img):
    """
    Takes in a nparray and determines the skew angle of the text, then corrects the skew and returns the corrected image.\n
    Vars:\n
    - img <- numpy array of the label\n
    Returns:\n
    - Corrected image as a numpy array\n
    """
    #Crops down the skewImg to determine the skew angle
    img = cv2.resize(img, (0, 0), fx = 0.75, fy = 0.75)

    delta = 1
    limit = 45
    angles = np.arange(-limit, limit+delta, delta)
    scores = []
    for angle in angles:
        hist, score = findScore(img, angle)
        scores.append(score)
    bestScore = max(scores)
    bestAngle = angles[scores.index(bestScore)]
    rotated = inter.rotate(img, bestAngle, reshape = False, order = 0)
    print("[INFO] angle: {:.3f}".format(bestAngle))
    #cv2.imshow("Original", img)
    #cv2.imshow("Rotated", rotated)
    #cv2.waitKey(0)

    #Return img
    return rotated

校正前后标签的示例图像

Example images of the label before correction and after

  • Before correction:https://imgur.com/CO32WLn
  • After correction: https://imgur.com/XRaJ9Bz

如果有人可以帮助我解决这个问题,那将有很大帮助.

If anyone can help me figure this problem out, it would be of much help.

推荐答案

这是Projection Profile Method的一种确定倾斜的实现.获得二进制图像后,其想法是将图像旋转各种角度并在每次迭代中生成像素的直方图.为了确定偏斜角,我们比较峰之间的最大差异,并使用该偏斜角旋转图像以校正偏斜

Here's an implementation of the Projection Profile Method to determine skew. After obtaining a binary image, the idea is rotate the image at various angles and generate a histogram of pixels in each iteration. To determine the skew angle, we compare the maximum difference between peaks and using this skew angle, rotate the image to correct the skew

左(原始),右(校正)

Left (original), Right (corrected)

import cv2
import numpy as np
from scipy.ndimage import interpolation as inter

def correct_skew(image, delta=1, limit=5):
    def determine_score(arr, angle):
        data = inter.rotate(arr, angle, reshape=False, order=0)
        histogram = np.sum(data, axis=1)
        score = np.sum((histogram[1:] - histogram[:-1]) ** 2)
        return histogram, score

    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1] 

    scores = []
    angles = np.arange(-limit, limit + delta, delta)
    for angle in angles:
        histogram, score = determine_score(thresh, angle)
        scores.append(score)

    best_angle = angles[scores.index(max(scores))]

    (h, w) = image.shape[:2]
    center = (w // 2, h // 2)
    M = cv2.getRotationMatrix2D(center, best_angle, 1.0)
    rotated = cv2.warpAffine(image, M, (w, h), flags=cv2.INTER_CUBIC, \
              borderMode=cv2.BORDER_REPLICATE)

    return best_angle, rotated

if __name__ == '__main__':
    image = cv2.imread('1.png')
    angle, rotated = correct_skew(image)
    print(angle)
    cv2.imshow('rotated', rotated)
    cv2.imwrite('rotated.png', rotated)
    cv2.waitKey()

这篇关于适用于OCR的Python OpenCV偏斜校正的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆