How to warp a rectangular object to fit its larger bounding box


Question

Given this image:

I'd like to rotate and stretch it so that it fully fits the bounding box, with no whitespace outside the largest rectangular region. It should also handle worse perspective cases, like those in the links I list later on.

Basically, while it is barely noticeable, the rectangle is rotated slightly, and I'd like to correct that distortion.

However, I got an error when attempting to retrieve the four corner points of the contour. I have used contour approximation to isolate only the relevant-looking contours, and as you can see in the image that was successful, yet I still can't apply a perspective warp to it.

I've already tried the links here:

  • How to straighten a rotated rectangle area of an image using opencv in python?
  • https://www.pyimagesearch.com/2014/05/05/building-pokedex-python-opencv-perspective-warping-step-5-6/
  • https://www.pyimagesearch.com/2014/09/01/build-kick-ass-mobile-document-scanner-just-5-minutes/

I followed them with only minor modifications (such as not downscaling the image and then upscaling it) and a different input image.

A reader in the comments there ran into a similar error, but the author just said to use contour approximation. I did that, yet I still receive the same error.

I have already retrieved the contour (which, along with its bounding box, is the image shown earlier) and used this code to attempt the perspective warp:

def warp_perspective(cnt):
    # reshape cnt to get tl, tr, br, bl points
    pts = cnt.reshape(4, 2)
    rect = np.zeros((4, 2), dtype="float32")

    s = pts.sum(axis=1)
    rect[0] = pts[np.argmin(s)]     # top-left: smallest x + y
    rect[2] = pts[np.argmax(s)]     # bottom-right: largest x + y

    diff = np.diff(pts, axis=1)
    rect[1] = pts[np.argmin(diff)]  # top-right: smallest y - x
    rect[3] = pts[np.argmax(diff)]  # bottom-left: largest y - x

    # solve for the width of the image
    (tl, tr, br, bl) = rect
    widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
    widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))

    # solve for the height of the image
    heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
    heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))

    # get the final dimensions
    maxWidth = max(int(widthA), int(widthB))
    maxHeight = max(int(heightA), int(heightB))

    # construct the dst image
    dst = np.array([
        [0, 0],
        [maxWidth - 1, 0],
        [maxWidth - 1, maxHeight - 1],
        [0, maxHeight - 1]], dtype="float32")

    # calculate perspective transform matrix
    # warp the perspective
    M = cv2.getPerspectiveTransform(rect, dst)
    warp = cv2.warpPerspective(orig, M, (maxWidth, maxHeight))

    cv2.imshow("warped", warp)

    return warp

The function accepts cnt as a single contour.
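For reference, the sum/difference ordering trick used in the function can be sanity-checked on made-up corner coordinates (the point values below are invented for illustration): the top-left corner has the smallest x + y, the bottom-right the largest, and the y - x difference separates top-right from bottom-left.

```python
import numpy as np

# Hypothetical corners of a slightly rotated rectangle, deliberately shuffled.
pts = np.array([[190, 20], [15, 160], [10, 25], [200, 170]], dtype="float32")

rect = np.zeros((4, 2), dtype="float32")
s = pts.sum(axis=1)             # x + y for each point
rect[0] = pts[np.argmin(s)]     # top-left: smallest sum
rect[2] = pts[np.argmax(s)]     # bottom-right: largest sum

d = np.diff(pts, axis=1)        # y - x for each point
rect[1] = pts[np.argmin(d)]     # top-right: most negative difference
rect[3] = pts[np.argmax(d)]     # bottom-left: most positive difference

print(rect)  # ordered tl, tr, br, bl
```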

Upon running it, I ran into the error I mentioned earlier:

in warp_perspective
    pts = cnt.reshape(4, 2)
ValueError: cannot reshape array of size 2090 into shape (4,2)

I do not understand this at all. I successfully isolated and retrieved the correct contour and bounding box, and the only thing I did differently was to skip the downscaling.
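The error itself is a pure shape mismatch: cv2.findContours returns every boundary point of the shape, not just the four corners, so the contour array holds far more than the 8 values that reshape(4, 2) needs. A minimal sketch with synthetic data (the point counts and coordinates are made up) shows the difference between a raw contour and a 4-point approximation such as the one cv2.approxPolyDP returns:

```python
import numpy as np

# A raw OpenCV contour has shape (N, 1, 2). With N = 1045 boundary
# points that is 2090 values, so reshape(4, 2) -- which needs exactly
# 8 values -- raises the ValueError seen above.
raw_cnt = np.zeros((1045, 1, 2), dtype=np.int32)
try:
    raw_cnt.reshape(4, 2)
except ValueError as e:
    print(e)

# A 4-point polygon approximation (the kind cv2.approxPolyDP produces
# for a quadrilateral) reshapes cleanly into the (4, 2) corner array.
approx = np.array([[[12, 10]], [[205, 18]], [[198, 160]], [[8, 152]]],
                  dtype=np.int32)
pts = approx.reshape(4, 2)
print(pts.shape)  # (4, 2)
```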

Answer

Try this approach:

  • Convert the image to grayscale and blur with a bilateral filter
  • Otsu's threshold
  • Find contours
  • Perform contour approximation on the largest square contour
  • Perspective transform and rotate

Result

import cv2
import numpy as np
import imutils

def perspective_transform(image, corners):
    def order_corner_points(corners):
        # Separate corners into individual points
        # Index 0 - top-right
        #       1 - top-left
        #       2 - bottom-left
        #       3 - bottom-right
        corners = [(corner[0][0], corner[0][1]) for corner in corners]
        top_r, top_l, bottom_l, bottom_r = corners[0], corners[1], corners[2], corners[3]
        return (top_l, top_r, bottom_r, bottom_l)

    # Order points in clockwise order
    ordered_corners = order_corner_points(corners)
    top_l, top_r, bottom_r, bottom_l = ordered_corners

    # Determine width of new image which is the max distance between 
    # (bottom right and bottom left) or (top right and top left) x-coordinates
    width_A = np.sqrt(((bottom_r[0] - bottom_l[0]) ** 2) + ((bottom_r[1] - bottom_l[1]) ** 2))
    width_B = np.sqrt(((top_r[0] - top_l[0]) ** 2) + ((top_r[1] - top_l[1]) ** 2))
    width = max(int(width_A), int(width_B))

    # Determine height of new image which is the max distance between 
    # (top right and bottom right) or (top left and bottom left) y-coordinates
    height_A = np.sqrt(((top_r[0] - bottom_r[0]) ** 2) + ((top_r[1] - bottom_r[1]) ** 2))
    height_B = np.sqrt(((top_l[0] - bottom_l[0]) ** 2) + ((top_l[1] - bottom_l[1]) ** 2))
    height = max(int(height_A), int(height_B))

    # Construct new points to obtain top-down view of image in 
    # top_r, top_l, bottom_l, bottom_r order
    dimensions = np.array([[0, 0], [width - 1, 0], [width - 1, height - 1], 
                    [0, height - 1]], dtype = "float32")

    # Convert to Numpy format
    ordered_corners = np.array(ordered_corners, dtype="float32")

    # Find perspective transform matrix
    matrix = cv2.getPerspectiveTransform(ordered_corners, dimensions)

    # Transform the image
    transformed = cv2.warpPerspective(image, matrix, (width, height))

    # Rotate and return the result
    return imutils.rotate_bound(transformed, angle=-90)

image = cv2.imread('1.png')
original = image.copy()
blur = cv2.bilateralFilter(image,9,75,75)
gray = cv2.cvtColor(blur, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray,0,255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

for c in cnts:
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.015 * peri, True)

    if len(approx) == 4:
        cv2.drawContours(image,[c], 0, (36,255,12), 3)
        transformed = perspective_transform(original, approx)

cv2.imshow('thresh', thresh)
cv2.imshow('image', image)
cv2.imshow('transformed', transformed)
cv2.waitKey()
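As a side note, if imutils is not available, the final bound-preserving 90° rotation can be approximated with NumPy alone. This is a sketch assuming OpenCV's row-major H×W array layout; np.rot90 with k=1 rotates counterclockwise, so verify the direction against imutils.rotate_bound for your case:

```python
import numpy as np

# A hypothetical 2x3 single-channel "image".
img = np.array([[1, 2, 3],
                [4, 5, 6]], dtype=np.uint8)

# np.rot90 with k=1 rotates 90 degrees counterclockwise and swaps
# height and width, like a bound-preserving 90-degree rotation.
rotated = np.rot90(img, k=1)
print(rotated.shape)  # (3, 2)
print(rotated)
```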

