How to de-skew a text image and also retrieve the new bounding box of that image?

Problem description

Here is a receipt image that I plotted using matplotlib. As you can see, the text in it is not straight. How can I de-skew it?

import cv2
import matplotlib.pyplot as plt
from matplotlib.patches import Polygon
from skimage import io

# Bounding box corners as [x, y] pairs: (x1, y1), (x2, y2), (x3, y3), (x4, y4)
bbox_coords = [[20, 68], [336, 68], [336, 100], [20, 100]]

image = io.imread('https://i.ibb.co/3WCsVBc/test.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)

fig, ax = plt.subplots(figsize=(20, 20))
ax.imshow(gray, cmap='Greys_r')

# To plot the bounding box, uncomment the two lines below
#rect = Polygon(bbox_coords, fill=False, linewidth=1, edgecolor='r')
#ax.add_patch(rect)
plt.show()

print(gray.shape)
# Output: (847, 486)

I think that to de-skew the image we first have to find its edges, so I tried the Canny algorithm and then extracted contours as shown below.

from skimage import filters, feature, measure

def edge_detector(image):
    # Smooth first so Canny responds less to noise
    image = filters.gaussian(image, 2, mode='reflect')
    edges = feature.canny(image)
    contours = measure.find_contours(edges, 0.8)
    return edges, contours

fig, ax = plt.subplots(figsize=(20, 20))

ax.imshow(gray, cmap='Greys_r')
edges, contours = edge_detector(gray)

# Contour points are (row, col), so plot the column as x and the row as y
for contour in contours:
    ax.plot(contour[:, 1], contour[:, 0], linewidth=2)

The edges I get from the code above are the edges of each piece of text, which is not what I need. I need the edges of the receipt itself, right?

I also need a way to get the new bounding box coordinates after de-skewing (i.e., straightening) the image.

If anyone has worked on a similar problem, please help me out. Thanks.

Solution

Here's a modified implementation of the Projection Profile Method for correcting skewed images, as described in "Projection profile based skew estimation algorithm for JBIG compressed images". After obtaining a binary image, the idea is to rotate it through a range of candidate angles and compute a histogram of per-row pixel sums at each angle. Each angle is scored by the sum of squared differences between adjacent histogram values; the angle with the highest score is taken as the skew angle, and the image is rotated by it to correct the skew. The number of candidate angles is controlled by delta, the step between angles within ±limit degrees: the lower the delta, the more angles are checked, with the tradeoff that the process takes longer.
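To see why this score favors the un-skewed orientation, here is a minimal NumPy-only sketch on synthetic data. The names profile_score, straight and skewed are made up for this illustration, and the "skew" is simulated with a crude column-wise shift rather than a true rotation, so treat it as an intuition aid, not as part of the method itself.

```python
import numpy as np

def profile_score(binary):
    # Horizontal projection profile: one sum per row
    histogram = np.sum(binary, axis=1, dtype=float)
    # Sum of squared differences between adjacent rows
    return np.sum((histogram[1:] - histogram[:-1]) ** 2)

size = 32
rows = np.arange(size)

# "Text lines": horizontal stripes, 4 rows on, 4 rows off
straight = np.zeros((size, size))
straight[(rows // 4) % 2 == 0, :] = 1

# Simulated skew (illustrative assumption): shift each column down a bit
# more than the previous ones, so the stripes become slanted
skewed = np.empty_like(straight)
for j in range(size):
    skewed[:, j] = np.roll(straight[:, j], j // 4)

straight_score = profile_score(straight)
skewed_score = profile_score(skewed)
print(straight_score, skewed_score)
```

When the stripes are aligned with the rows, the profile jumps sharply between full and empty rows, so the score is high. When they are slanted, every row picks up a similar number of pixels, the profile flattens out, and the score drops.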


Code

import cv2
import numpy as np
from scipy.ndimage import rotate

def correct_skew(image, delta=.1, limit=5):
    def determine_score(arr, angle):
        # Rotate the binary image, then sum each row (horizontal projection profile)
        data = rotate(arr, angle, reshape=False, order=0)
        # Sum as float so row differences cannot wrap around in unsigned arithmetic
        histogram = np.sum(data, axis=1, dtype=float)
        # Sharp transitions between text rows and gaps give a high score
        score = np.sum((histogram[1:] - histogram[:-1]) ** 2)
        return histogram, score

    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    blur = cv2.medianBlur(gray, 3)
    thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

    # Score every candidate angle in [-limit, limit] in steps of delta
    angles = np.arange(-limit, limit + delta, delta)
    scores = [determine_score(thresh, angle)[1] for angle in angles]

    best_angle = angles[int(np.argmax(scores))]

    # Rotate the original image about its center by the best angle
    (h, w) = image.shape[:2]
    center = (w // 2, h // 2)
    M = cv2.getRotationMatrix2D(center, best_angle, 1.0)
    rotated = cv2.warpAffine(image, M, (w, h), flags=cv2.INTER_CUBIC,
                             borderMode=cv2.BORDER_REPLICATE)

    return best_angle, rotated

if __name__ == '__main__':
    image = cv2.imread('1.jpg')
    angle, rotated = correct_skew(image)
    print(angle)
    cv2.imshow('rotated', rotated)
    cv2.imwrite('rotated.png', rotated)
    cv2.waitKey()

Note: For another approach, also take a look at "rotated skewed image to upright position".
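The code above does not cover the second part of the question, the new bounding box. One way to get it, sketched here as an assumption rather than as part of the original answer, is to apply the same 2x3 affine matrix M that correct_skew passes to cv2.warpAffine to the bounding box corners. The sketch is NumPy-only: rotation_matrix is a hypothetical helper that follows the formula documented for cv2.getRotationMatrix2D, transform_bbox and axis_aligned_bbox are also illustrative names, and the angle of -2.0 degrees is a made-up value standing in for best_angle.

```python
import numpy as np

def rotation_matrix(center, angle_deg, scale=1.0):
    """2x3 affine matrix, per the formula documented for cv2.getRotationMatrix2D."""
    cx, cy = center
    alpha = scale * np.cos(np.deg2rad(angle_deg))
    beta = scale * np.sin(np.deg2rad(angle_deg))
    return np.array([
        [alpha, beta, (1 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1 - alpha) * cy],
    ])

def transform_bbox(bbox_coords, M):
    """Apply a 2x3 affine matrix to a list of [x, y] corner points."""
    pts = np.asarray(bbox_coords, dtype=float)
    ones = np.ones((pts.shape[0], 1))
    homogeneous = np.hstack([pts, ones])  # shape (N, 3)
    return homogeneous @ M.T              # shape (N, 2)

def axis_aligned_bbox(points):
    """Smallest upright rectangle (x_min, y_min, x_max, y_max) around the points."""
    x_min, y_min = points.min(axis=0)
    x_max, y_max = points.max(axis=0)
    return x_min, y_min, x_max, y_max

bbox_coords = [[20, 68], [336, 68], [336, 100], [20, 100]]
h, w = 847, 486            # image shape from the question
center = (w // 2, h // 2)  # same center that correct_skew uses
angle = -2.0               # hypothetical skew angle, stands in for best_angle

M = rotation_matrix(center, angle)
new_corners = transform_bbox(bbox_coords, M)
print(new_corners)
print(axis_aligned_bbox(new_corners))
```

In practice it would probably be simpler to have correct_skew return M alongside best_angle and rotated, and pass that matrix to transform_bbox directly. The transformed corners generally form a rotated quadrilateral; axis_aligned_bbox is only needed if an upright rectangle is required.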
