Calculating aspect ratio of Perspective Transform destination image

Problem Description

I've recently implemented a Perspective Transform with OpenCV in my Android app. Almost everything works without issues, but one aspect needs much more work.

The problem is that I do not know how to compute the correct aspect ratio of the destination image of the Perspective Transform (it should not have to be set manually), so that the result keeps the aspect ratio of the real thing/image regardless of the camera angle. Note that the starting coordinates do not form a trapezoid; they form an arbitrary quadrangle.

Say I have a photograph of a book taken from an angle of roughly 45 degrees, and I want the destination image's aspect ratio to be pretty much the same as the book's. That is hard to do from a 2D photo, but the CamScanner app does it perfectly. I've made a very simple way to compute the size of my destination image (with no expectation that it would work the way I want), but it makes the image from a 45 degree angle about 20% shorter, and the lower the angle, the more the image height shrinks, while CamScanner gets it right regardless of the angle:

Here, CamScanner keeps the aspect ratio of the destination image (the second one) the same as the book's; it does so quite accurately even at a ~20 degree angle.

Meanwhile, my code looks like this (when computing the size of the destination image it makes no attempt to do what I am asking about in this question):

public static Mat PerspectiveTransform(Point[] cropCoordinates, float ratioW, float ratioH, Bitmap croppedImage)
{
    if (cropCoordinates.length != 4) return null;

    double width1, width2, height1, height2, avgw, avgh;
    Mat src = new Mat();
    List<Point> startCoords = new ArrayList<>();
    List<Point> resultCoords = new ArrayList<>();

    Utils.bitmapToMat(croppedImage, src);

    for (int i = 0; i < 4; i++)
    {
        if (cropCoordinates[i].y < 0) cropCoordinates[i] = new Point(cropCoordinates[i].x, 0); // clamp negative y to 0
        startCoords.add(new Point(cropCoordinates[i].x * ratioW, cropCoordinates[i].y * ratioH));
    }

    // rough size estimate: average the lengths of opposite sides of the quadrangle
    width1 = Math.sqrt(Math.pow(startCoords.get(2).x - startCoords.get(3).x,2) + Math.pow(startCoords.get(2).y - startCoords.get(3).y,2));
    width2 = Math.sqrt(Math.pow(startCoords.get(1).x - startCoords.get(0).x,2) + Math.pow(startCoords.get(1).y - startCoords.get(0).y,2));
    height1 = Math.sqrt(Math.pow(startCoords.get(1).x - startCoords.get(2).x, 2) + Math.pow(startCoords.get(1).y - startCoords.get(2).y, 2));
    height2 = Math.sqrt(Math.pow(startCoords.get(0).x - startCoords.get(3).x, 2) + Math.pow(startCoords.get(0).y - startCoords.get(3).y, 2));
    avgw = (width1 + width2) / 2;
    avgh = (height1 + height2) / 2;

    resultCoords.add(new Point(0, 0));
    resultCoords.add(new Point(avgw-1, 0));
    resultCoords.add(new Point(avgw-1, avgh-1));
    resultCoords.add(new Point(0, avgh-1));

    Mat start = Converters.vector_Point2f_to_Mat(startCoords);
    Mat result = Converters.vector_Point2d_to_Mat(resultCoords);
    start.convertTo(start, CvType.CV_32FC2);
    result.convertTo(result,CvType.CV_32FC2);

    Mat mat = new Mat();
    Mat perspective = Imgproc.getPerspectiveTransform(start, result);
    Imgproc.warpPerspective(src, mat, perspective, new Size(avgw, avgh));

    return mat;
}

And from roughly the same angle, my method produces this result:

What I want to know is how this can be done. I find it interesting how they manage to determine the length of the object just from the coordinates of its 4 corners. Also, if possible, please provide some code, a mathematical explanation, or articles on the same or a similar technique.

Thank you in advance.

Solution

This has come up a few times before on SO but I've never seen a full answer, so here goes. The implementation shown here is based on this paper which derives the full equations: http://research.microsoft.com/en-us/um/people/zhang/papers/tr03-39.pdf

Essentially, it shows that, assuming a pinhole camera model, it is possible to calculate the aspect ratio of a projected rectangle (but not its scale, unsurprisingly): one first solves for the focal length and then obtains the aspect ratio. Here's a sample implementation in Python using OpenCV. Note that you need to have the 4 detected corners in the right order or it won't work (note the order in the code, it is a zigzag). The reported error rates are in the 3-5% range.
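
For reference, these are the key relations from the paper that the script below implements (my own transcription, so check it against the paper if in doubt). Here $m_1 \dots m_4$ are the detected corners in homogeneous pixel coordinates, in the zigzag order top-left, top-right, bottom-left, bottom-right, and $(u_0, v_0)$ is the image centre, used as the principal point:

$$k_2 = \frac{(m_1 \times m_4)\cdot m_3}{(m_2 \times m_4)\cdot m_3}, \qquad k_3 = \frac{(m_1 \times m_4)\cdot m_2}{(m_3 \times m_4)\cdot m_2}, \qquad n_2 = k_2 m_2 - m_1, \qquad n_3 = k_3 m_3 - m_1$$

$$f = \sqrt{\left|\frac{1}{n_{23}\,n_{33}}\Big[\big(n_{21}n_{31} - (n_{21}n_{33} + n_{23}n_{31})u_0 + n_{23}n_{33}u_0^2\big) + \big(n_{22}n_{32} - (n_{22}n_{33} + n_{23}n_{32})v_0 + n_{23}n_{33}v_0^2\big)\Big]\right|}$$

$$\frac{w}{h} = \sqrt{\frac{n_2^{\top} A^{-\top} A^{-1}\, n_2}{n_3^{\top} A^{-\top} A^{-1}\, n_3}}, \qquad A = \begin{pmatrix} f & 0 & u_0 \\ 0 & f & v_0 \\ 0 & 0 & 1 \end{pmatrix}$$

Roughly speaking, $n_2$ and $n_3$ encode the directions of the rectangle's two sides as seen by the camera; their orthogonality pins down the focal length $f$, and the ratio of their back-projected lengths gives the true aspect ratio $w/h$.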

import math
import cv2
import scipy.spatial.distance
import numpy as np

img = cv2.imread('img.png')
(rows,cols,_) = img.shape

#image center
u0 = (cols)/2.0
v0 = (rows)/2.0

#detected corners on the original image
p = []
p.append((67,74))
p.append((270,64))
p.append((10,344))
p.append((343,331))

#widths and heights of the projected image
w1 = scipy.spatial.distance.euclidean(p[0],p[1])
w2 = scipy.spatial.distance.euclidean(p[2],p[3])

h1 = scipy.spatial.distance.euclidean(p[0],p[2])
h2 = scipy.spatial.distance.euclidean(p[1],p[3])

w = max(w1,w2)
h = max(h1,h2)

#visible aspect ratio
ar_vis = float(w)/float(h)

#make numpy arrays and append 1 for linear algebra
m1 = np.array((p[0][0],p[0][1],1)).astype('float32')
m2 = np.array((p[1][0],p[1][1],1)).astype('float32')
m3 = np.array((p[2][0],p[2][1],1)).astype('float32')
m4 = np.array((p[3][0],p[3][1],1)).astype('float32')

#calculate the focal distance
k2 = np.dot(np.cross(m1,m4),m3) / np.dot(np.cross(m2,m4),m3)
k3 = np.dot(np.cross(m1,m4),m2) / np.dot(np.cross(m3,m4),m2)

n2 = k2 * m2 - m1
n3 = k3 * m3 - m1

n21 = n2[0]
n22 = n2[1]
n23 = n2[2]

n31 = n3[0]
n32 = n3[1]
n33 = n3[2]

f = math.sqrt(np.abs( (1.0/(n23*n33)) * ((n21*n31 - (n21*n33 + n23*n31)*u0 + n23*n33*u0*u0) + (n22*n32 - (n22*n33+n23*n32)*v0 + n23*n33*v0*v0))))

A = np.array([[f,0,u0],[0,f,v0],[0,0,1]]).astype('float32')

At = np.transpose(A)
Ati = np.linalg.inv(At)
Ai = np.linalg.inv(A)

#calculate the real aspect ratio
ar_real = math.sqrt(np.dot(np.dot(np.dot(n2,Ati),Ai),n2)/np.dot(np.dot(np.dot(n3,Ati),Ai),n3))

if ar_real < ar_vis:
    W = int(w)
    H = int(W / ar_real)
else:
    H = int(h)
    W = int(ar_real * H)

pts1 = np.array(p).astype('float32')
pts2 = np.float32([[0,0],[W,0],[0,H],[W,H]])

#project the image with the new w/h
M = cv2.getPerspectiveTransform(pts1,pts2)

dst = cv2.warpPerspective(img,M,(W,H))

cv2.imshow('img',img)
cv2.imshow('dst',dst)
cv2.imwrite('orig.png',img)
cv2.imwrite('proj.png',dst)

cv2.waitKey(0)

Original:

Projected (the resolution is very low since I cropped the image from your screenshot, but the aspect ratio seems correct):
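
Since the question is about Android/Java, below is a rough, untested transcription of the same calculation into the OpenCV Java bindings; it only uses calls that already appear in the question's code (Imgproc.getPerspectiveTransform, Imgproc.warpPerspective). Treat it as a sketch of the math above, not a drop-in replacement for CamScanner; the helper name warpWithRealAspectRatio is just for illustration, and the corners must be passed in the same zigzag order (top-left, top-right, bottom-left, bottom-right):

import org.opencv.core.Mat;
import org.opencv.core.MatOfPoint2f;
import org.opencv.core.Point;
import org.opencv.core.Size;
import org.opencv.imgproc.Imgproc;

public class AspectRatioWarp {

    // cross and dot products for 3-vectors (homogeneous image points)
    private static double[] cross(double[] a, double[] b) {
        return new double[]{ a[1]*b[2] - a[2]*b[1],
                             a[2]*b[0] - a[0]*b[2],
                             a[0]*b[1] - a[1]*b[0] };
    }

    private static double dot(double[] a, double[] b) {
        return a[0]*b[0] + a[1]*b[1] + a[2]*b[2];
    }

    // n^T A^-T A^-1 n for A = [[f,0,u0],[0,f,v0],[0,0,1]], computed as |A^-1 n|^2
    private static double normSqAinv(double[] n, double f, double u0, double v0) {
        double x = (n[0] - u0 * n[2]) / f;
        double y = (n[1] - v0 * n[2]) / f;
        return x * x + y * y + n[2] * n[2];
    }

    // corners in zigzag order: top-left, top-right, bottom-left, bottom-right
    public static Mat warpWithRealAspectRatio(Mat src, Point[] corners) {
        double u0 = src.cols() / 2.0, v0 = src.rows() / 2.0;

        double[] m1 = { corners[0].x, corners[0].y, 1 };
        double[] m2 = { corners[1].x, corners[1].y, 1 };
        double[] m3 = { corners[2].x, corners[2].y, 1 };
        double[] m4 = { corners[3].x, corners[3].y, 1 };

        // visible (projected) width/height, used only to pick the output scale
        double w = Math.max(Math.hypot(m2[0]-m1[0], m2[1]-m1[1]), Math.hypot(m4[0]-m3[0], m4[1]-m3[1]));
        double h = Math.max(Math.hypot(m3[0]-m1[0], m3[1]-m1[1]), Math.hypot(m4[0]-m2[0], m4[1]-m2[1]));
        double arVis = w / h;

        double k2 = dot(cross(m1, m4), m3) / dot(cross(m2, m4), m3);
        double k3 = dot(cross(m1, m4), m2) / dot(cross(m3, m4), m2);
        double[] n2 = { k2*m2[0]-m1[0], k2*m2[1]-m1[1], k2*m2[2]-m1[2] };
        double[] n3 = { k3*m3[0]-m1[0], k3*m3[1]-m1[1], k3*m3[2]-m1[2] };

        // focal length, same formula as in the Python version
        double f = Math.sqrt(Math.abs((1.0 / (n2[2]*n3[2]))
                * ((n2[0]*n3[0] - (n2[0]*n3[2] + n2[2]*n3[0])*u0 + n2[2]*n3[2]*u0*u0)
                 + (n2[1]*n3[1] - (n2[1]*n3[2] + n2[2]*n3[1])*v0 + n2[2]*n3[2]*v0*v0))));

        // true aspect ratio of the rectangle
        double arReal = Math.sqrt(normSqAinv(n2, f, u0, v0) / normSqAinv(n3, f, u0, v0));

        int W, H;
        if (arReal < arVis) { W = (int) w; H = (int) (W / arReal); }
        else                { H = (int) h; W = (int) (arReal * H); }

        MatOfPoint2f srcPts = new MatOfPoint2f(corners);
        MatOfPoint2f dstPts = new MatOfPoint2f(
                new Point(0, 0), new Point(W, 0), new Point(0, H), new Point(W, H));

        Mat M = Imgproc.getPerspectiveTransform(srcPts, dstPts);
        Mat dst = new Mat();
        Imgproc.warpPerspective(src, dst, M, new Size(W, H));
        return dst;
    }
}

A call would then look something like Mat fixed = AspectRatioWarp.warpWithRealAspectRatio(src, scaledCorners);, where scaledCorners are the same scaled points the original PerspectiveTransform method builds in startCoords.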
