Displaying stitched images together without cutoff using warpAffine

Question

I'm trying to stitch 2 images together by using template matching to find 3 sets of points, which I pass to cv2.getAffineTransform() to get a warp matrix, which I then pass to cv2.warpAffine() to align my images.

However, when I join my images, the majority of my affine'd image isn't shown. I've tried using different techniques to select points, changed the order of arguments, etc., but I can only ever get a thin sliver of the affine'd image to show.

Could somebody tell me whether my approach is a valid one and suggest where I might be making an error? Any guesses as to what could be causing the problem would be greatly appreciated. Thanks in advance.

This is the final result that I get. Here are the original images (1, 2) and the code that I use:

This is the variable trans:

array([[  1.00768049e+00,  -3.76690353e-17,  -3.13824885e+00],
       [  4.84461775e-03,   1.30769231e+00,   9.61912797e+02]])

Here are the points passed to cv2.getAffineTransform: unified_pair1

array([[  671.,  1024.],
       [   15.,   979.],
       [   15.,   962.]], dtype=float32)

unified_pair2

array([[ 669.,   45.],
       [  18.,   13.],
       [  18.,    0.]], dtype=float32)


import cv2
import numpy as np


def showimage(image, name="No name given"):
    cv2.imshow(name, image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    return

image_a = cv2.imread('image_a.png')
image_b = cv2.imread('image_b.png')


def get_roi(image):
    roi = cv2.selectROI(image) # spacebar to confirm selection
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    crop = image[int(roi[1]):int(roi[1]+roi[3]), int(roi[0]):int(roi[0]+roi[2])]
    return crop
temp_1 = get_roi(image_a)
temp_2 = get_roi(image_a)
temp_3 = get_roi(image_a)

def find_template(template, search_image_a, search_image_b):
    ccnorm_im_a = cv2.matchTemplate(search_image_a, template, cv2.TM_CCORR_NORMED)
    template_loc_a = np.where(ccnorm_im_a == ccnorm_im_a.max())

    ccnorm_im_b = cv2.matchTemplate(search_image_b, template, cv2.TM_CCORR_NORMED)
    template_loc_b = np.where(ccnorm_im_b == ccnorm_im_b.max())
    return template_loc_a, template_loc_b


coord_a1, coord_b1 = find_template(temp_1, image_a, image_b)
coord_a2, coord_b2 = find_template(temp_2, image_a, image_b)
coord_a3, coord_b3 = find_template(temp_3, image_a, image_b)

def unnest_list(coords_list):
    coords_list = [a[0] for a in coords_list]
    return coords_list

coord_a1 = unnest_list(coord_a1)
coord_b1 = unnest_list(coord_b1)
coord_a2 = unnest_list(coord_a2)
coord_b2 = unnest_list(coord_b2)
coord_a3 = unnest_list(coord_a3)
coord_b3 = unnest_list(coord_b3)

def unify_coords(coords1,coords2,coords3):
    unified = []
    unified.extend([coords1, coords2, coords3])
    return unified

# Create 2 lists, each containing 3 coordinate pairs
unified_pair1 = unify_coords(coord_a1, coord_a2, coord_a3)
unified_pair2 = unify_coords(coord_b1, coord_b2, coord_b3)

# Convert elements of lists to numpy arrays with data type float32
unified_pair1 = np.asarray(unified_pair1, dtype=np.float32)
unified_pair2 = np.asarray(unified_pair2, dtype=np.float32)

# Get result of the affine transformation
trans = cv2.getAffineTransform(unified_pair1, unified_pair2)

# Apply the affine transformation to original image
result = cv2.warpAffine(image_a, trans, (image_a.shape[1] + image_b.shape[1], image_a.shape[0]))
result[0:image_b.shape[0], image_b.shape[1]:] = image_b

showimage(result)
cv2.imwrite('result.png', result)

Sources: Approach based on advice received here, this tutorial and this example from the docs.

Answer

July 12

This post inspired a GitHub repo providing functions to accomplish this task; one for a padded warpAffine() and another for a padded warpPerspective(). Fork the Python version or the C++ version.

What any transformation does is take your point coordinates (x, y) and map them to new locations (x', y'):

[s*x']   [h1 h2 h3]   [x]
[s*y'] = [h4 h5 h6] * [y]
[s   ]   [h7 h8  1]   [1]

where s is some scaling factor. You must divide the new coordinates by the scale factor to get back the proper pixel locations (x', y'). Technically, this is only true of homographies---(3, 3) transformation matrices---you don't need to scale for affine transformations (you don't even need to use homogeneous coordinates...but it's better to keep this discussion general).
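
As a quick illustration of that scale factor (just a sketch with a made-up perspective matrix, not from the original post): when the bottom row is not [0, 0, 1], s is no longer 1 and the division actually changes the result.

H = np.array([[1., 0., 0.],
              [0., 1., 0.],
              [0.001, 0., 1.]])   # made-up homography with a non-trivial bottom row
pt = np.array([100., 50., 1.])    # homogeneous point (x, y, 1)
mapped = H.dot(pt)                # [100., 50., 1.1] -- here s = 1.1
mapped /= mapped[2]               # divide by s to recover the true pixel location
# mapped is now approximately [90.91, 45.45, 1.]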

Then the actual pixel values are moved to those new locations, and the color values are interpolated to fit the new pixel grid. So during this process, these new locations get recorded at some point. We'll need those locations to see where the pixels actually move to, relative to the other image. Let's start with an easy example and see where points are mapped.

Suppose your transformation matrix simply shifts pixels to the left by ten pixels. Translation is handled by the last column; the first row is the translation in x and second row is the translation in y. So we would have an identity matrix, but with -10 in the first row, third column. Where would the pixel (0,0) be mapped? Hopefully, (-10,0) if logic makes any sense. And in fact, it does:

transf = np.array([[1.,0.,-10.],[0.,1.,0.],[0.,0.,1.]])
homg_pt = np.array([0,0,1])
new_homg_pt = transf.dot(homg_pt)
new_homg_pt /= new_homg_pt[2]
# new_homg_pt = [-10.  0.  1.]

Perfect! So we can figure out where all points map with a little linear algebra. We will need to get all the (x, y) points and put them into a huge array so that every single point is in its own column. Let's pretend our image is only 4x4.

h, w = src.shape[:2] # 4, 4
indY, indX = np.indices((h,w))  # similar to meshgrid/mgrid
lin_homg_pts = np.stack((indX.ravel(), indY.ravel(), np.ones(indY.size)))

These lin_homg_pts now contain every homogeneous point:

[[ 0.  1.  2.  3.  0.  1.  2.  3.  0.  1.  2.  3.  0.  1.  2.  3.]
 [ 0.  0.  0.  0.  1.  1.  1.  1.  2.  2.  2.  2.  3.  3.  3.  3.]
 [ 1.  1.  1.  1.  1.  1.  1.  1.  1.  1.  1.  1.  1.  1.  1.  1.]]

Then we can do matrix multiplication to get the mapped value of every point. For simplicity, let's stick with the previous homography.

trans_lin_homg_pts = transf.dot(lin_homg_pts)
trans_lin_homg_pts /= trans_lin_homg_pts[2,:]

Now we have the transformed points:

[[-10. -9. -8. -7. -10. -9. -8. -7. -10. -9. -8. -7. -10. -9. -8. -7.]
 [  0.  0.  0.  0.   1.  1.  1.  1.   2.  2.  2.  2.   3.  3.  3.  3.]
 [  1.  1.  1.  1.   1.  1.  1.  1.   1.  1.  1.  1.   1.  1.  1.  1.]]

As we can see, everything is working as expected: we have shifted the x-values only, by -10.

Notice that these pixel locations are negative---they're outside of the image bounds. If we do something a little more complex and rotate the image by 45 degrees, we'll get some pixel values well outside our original bounds. We don't care about every pixel value, though; we just need to know how far outside the original image bounds the farthest pixels land, so that we can pad the original image that far out before displaying the warped image on it.

theta = 45*np.pi/180
transf = np.array([
    [ np.cos(theta),np.sin(theta),0],
    [-np.sin(theta),np.cos(theta),0],
    [0.,0.,1.]])
print(transf)
trans_lin_homg_pts = transf.dot(lin_homg_pts)
minX = np.min(trans_lin_homg_pts[0,:])
minY = np.min(trans_lin_homg_pts[1,:])
maxX = np.max(trans_lin_homg_pts[0,:])
maxY = np.max(trans_lin_homg_pts[1,:])
# minX: 0.0, minY: -2.12132034356, maxX: 4.24264068712, maxY: 2.12132034356,

So we see that we can get pixel locations well outside our original image, in both the negative and positive directions. The minimum x value doesn't change because when a homography applies a rotation, it does so from the top-left corner. Now, one thing to note here is that I've applied the transformation to all pixels in the image. But this is really unnecessary; you can simply warp the four corner points and see where they land.
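
For instance, a sketch of that corner-only shortcut, reusing the transf rotation and the toy 4x4 size from above (the full function further down does exactly this):

h, w = 4, 4                                 # toy image size from above
corner_pts = np.array([[0, w, w, 0],
                       [0, 0, h, h],
                       [1, 1, 1, 1]], dtype=np.float64)  # the four corners, homogeneous
warped_corners = transf.dot(corner_pts)
warped_corners /= warped_corners[2, :]      # a no-op for a pure rotation, but needed for homographies
minX, minY = warped_corners[0].min(), warped_corners[1].min()
maxX, maxY = warped_corners[0].max(), warped_corners[1].max()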

Note that when you call cv2.warpAffine() you have to input the destination size. These transformed pixel values reference that size. So if a pixel gets mapped to (-10, 0), it won't show up in the destination image. That means we'll have to make another homography with a translation that shifts all pixel locations to be positive, and then we can pad the image matrix to compensate for our shift. We'll also have to pad the original image on the bottom and the right if the homography moves points to positions larger than the image.

In the last example, the min x value is the same, so we need no horizontal shift. However, the min y value has dropped by about two pixels, so we need to shift the image two pixels down. First, let's create the padded destination image.

pad_sz = list(src.shape) # in case three channel
pad_sz[0] = np.round(np.maximum(pad_sz[0], maxY) - np.minimum(0, minY)).astype(int)
pad_sz[1] = np.round(np.maximum(pad_sz[1], maxX) - np.minimum(0, minX)).astype(int)
dst_pad = np.zeros(pad_sz, dtype=np.uint8)
# pad_sz = [6, 4, 3]

As we can see, the height increased from the original by two pixels to account for that shift.

Now, we need to create a new homography matrix to translate the warped image by the same amount that we shifted by. And to apply both transformations---the original and this new shift---we have to compose the two homographies (for an affine transformation, you can simply add the translation, but not for a homography). Additionally, we need to divide by the last entry to make sure the scale is still proper (again, only for homographies):

anchorX, anchorY = 0, 0
transl_transf = np.eye(3,3)
if minX < 0: 
    anchorX = np.round(-minX).astype(int)
    transl_transf[0,2] += anchorX
if minY < 0:
    anchorY = np.round(-minY).astype(int)
    transl_transf[1,2] += anchorY
new_transf = transl_transf.dot(transf)
new_transf /= new_transf[2,2]
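
For comparison, the affine shortcut mentioned above would simply add the shift to the translation column; a sketch, assuming A is some 2x3 matrix from cv2.getAffineTransform() (not defined in this post):

A_shifted = A.copy()         # A: hypothetical 2x3 affine matrix
A_shifted[0, 2] += anchorX   # shift in x
A_shifted[1, 2] += anchorY   # shift in y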

I also created here the anchor points for where we will place the destination image into the padded matrix; it's shifted by the same amount the homography will shift the image. So let's place the destination image inside the padded matrix:

dst_pad[anchorY:anchorY+dst_sz[0], anchorX:anchorX+dst_sz[1]] = dst

Warp with the new transformation into the padded image

All we have left to do is apply the new transformation to the source image (with the padded destination size), and then we can overlay the two images.

warped = cv2.warpPerspective(src, new_transf, (pad_sz[1],pad_sz[0]))

alpha = 0.3
beta = 1 - alpha
blended = cv2.addWeighted(warped, alpha, dst_pad, beta, 1.0)

Putting it all together

Let's create a function for this, since we were creating quite a few variables we don't need at the end here. For inputs we need the source image, the destination image, and the original homography. For outputs we simply want the padded destination image and the warped image. Note that in the examples we used a 3x3 homography, so we'd better make sure we send in 3x3 transforms instead of 2x3 affine or Euclidean warps. You can just add the row [0, 0, 1] to the bottom of any affine warp and you'll be fine.
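
That conversion is a one-liner; a sketch, where aff stands in for any 2x3 matrix (for example the trans from the question):

aff_3x3 = np.vstack((aff, [0., 0., 1.]))   # promote a 2x3 affine warp to a full 3x3 transform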

def warpPerspectivePadded(src, dst, transf):

    src_h, src_w = src.shape[:2]
    lin_homg_pts = np.array([[0, src_w, src_w, 0], [0, 0, src_h, src_h], [1, 1, 1, 1]])

    trans_lin_homg_pts = transf.dot(lin_homg_pts)
    trans_lin_homg_pts /= trans_lin_homg_pts[2,:]

    minX = np.min(trans_lin_homg_pts[0,:])
    minY = np.min(trans_lin_homg_pts[1,:])
    maxX = np.max(trans_lin_homg_pts[0,:])
    maxY = np.max(trans_lin_homg_pts[1,:])

    # calculate the needed padding and create a blank image to place dst within
    dst_sz = list(dst.shape)
    pad_sz = dst_sz.copy() # to get the same number of channels
    pad_sz[0] = np.round(np.maximum(dst_sz[0], maxY) - np.minimum(0, minY)).astype(int)
    pad_sz[1] = np.round(np.maximum(dst_sz[1], maxX) - np.minimum(0, minX)).astype(int)
    dst_pad = np.zeros(pad_sz, dtype=np.uint8)

    # add translation to the transformation matrix to shift to positive values
    anchorX, anchorY = 0, 0
    transl_transf = np.eye(3,3)
    if minX < 0: 
        anchorX = np.round(-minX).astype(int)
        transl_transf[0,2] += anchorX
    if minY < 0:
        anchorY = np.round(-minY).astype(int)
        transl_transf[1,2] += anchorY
    new_transf = transl_transf.dot(transf)
    new_transf /= new_transf[2,2]

    dst_pad[anchorY:anchorY+dst_sz[0], anchorX:anchorX+dst_sz[1]] = dst

    warped = cv2.warpPerspective(src, new_transf, (pad_sz[1],pad_sz[0]))

    return dst_pad, warped

Example of running the function

Finally, we can call this function with some real images and a homography and see how it pans out. I'll borrow the example from LearnOpenCV:

src = cv2.imread('book2.jpg')
pts_src = np.array([[141, 131], [480, 159], [493, 630],[64, 601]], dtype=np.float32)
dst = cv2.imread('book1.jpg')
pts_dst = np.array([[318, 256],[534, 372],[316, 670],[73, 473]], dtype=np.float32)

transf = cv2.getPerspectiveTransform(pts_src, pts_dst)

dst_pad, warped = warpPerspectivePadded(src, dst, transf)

alpha = 0.5
beta = 1 - alpha
blended = cv2.addWeighted(warped, alpha, dst_pad, beta, 1.0)
cv2.imshow("Blended Warped Image", blended)
cv2.waitKey(0)

And we end up with this padded warped image:

as opposed to the typical cut off warp you would normally get.
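
Applied back to the question, a sketch of how warpPerspectivePadded() could replace the original warpAffine() call (assuming image_a, image_b, and the 2x3 trans matrix from the question's code are in scope):

# trans maps image_a coordinates onto image_b coordinates, so image_a is the source
trans_3x3 = np.vstack((trans, [0., 0., 1.]))            # 2x3 affine -> 3x3 transform
padded_b, warped_a = warpPerspectivePadded(image_a, image_b, trans_3x3)
blended = cv2.addWeighted(warped_a, 0.5, padded_b, 0.5, 1.0)
cv2.imwrite('result_padded.png', blended)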
