OpenCV - Projection, homography matrix and bird's eye view

Question

I'd like to get the homography matrix for a bird's eye view, and I know the projection matrix of the camera. Is there any relation between them?

Thanks.

Recommended answer

A projection matrix is defined as the product of the camera's intrinsic matrix (focal length, principal point, etc.) and its extrinsic matrix (rotation and translation). The question is: with respect to what are your rotation and translation defined? For example, I can imagine another camera or a 3D object with respect to which you have these rotations and translations. Otherwise your projection is just the intrinsic matrix.

Think first about the information you need to obtain a bird's eye view: at a minimum, you need to know how your camera is oriented w.r.t. the ground surface. If you also know the camera elevation, you can create a metric reconstruction. But since you mentioned a homography, I assume you want a bird's eye view of a flat surface, since a homography maps points between two flat surfaces: in your case, points on the flat ground to points on your flat sensor.

Let's consider the pinhole camera equation. It basically says that [u, v, 1]^T ~ A*[R|t]*[x, y, z, 1]^T, where A is the camera intrinsic matrix and ~ means equality up to scale. Now, since you deal with a ground plane, you can align a new coordinate system with it so that every ground point has z=0; [R|t] is then the rotation and translation from this coordinate system into your camera-aligned system.
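
As a rough illustration of the pinhole equation, here is a minimal NumPy sketch. The intrinsic values (focal length 800 px, principal point at (320, 240)) and the pose (identity rotation, ground origin 5 units in front of the camera) are made-up numbers, and `project` is a hypothetical helper name, not an OpenCV function.

```python
import numpy as np

# Hypothetical intrinsics: focal length 800 px, principal point (320, 240).
A = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

# Hypothetical extrinsics: camera axes aligned with the world axes,
# world origin 5 units in front of the camera.
R = np.eye(3)
t = np.array([[0.0], [0.0], [5.0]])
Rt = np.hstack([R, t])  # the 3x4 matrix [R|t]


def project(point_xyz):
    """Project a 3D world point to pixel coordinates (u, v)."""
    X = np.append(point_xyz, 1.0)  # homogeneous [x, y, z, 1]
    uvw = A @ Rt @ X               # defined only up to scale
    return uvw[:2] / uvw[2]        # divide by the last component


# A point on the plane z = 0.
uv = project(np.array([1.0, 0.5, 0.0]))
print(uv)
```

The division by the last component is what the ~ sign in the equation stands for: the left-hand side is only known up to a scale factor.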

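Next, note that [R|t] is a 3x4 matrix and that it loses one column since z=0: the third column of R is always multiplied by zero, so it can be dropped. What remains is the 3x3 matrix [R'|t], where R' holds the first two columns of R, and the homography is H = A*[R'|t]. OK, all we did so far was prove that a homography mapping exists between the ground and your sensor.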
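
The reduction from the 3x4 matrix to a 3x3 homography can be checked numerically. The sketch below assumes an arbitrary example pose (a 30 degree tilt about the camera x axis and a made-up translation) and compares the full projection of a ground point with the result of the reduced matrix.

```python
import numpy as np

# Hypothetical intrinsics, for illustration only.
A = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

# Hypothetical pose: rotation of 30 degrees about the x axis.
theta = np.deg2rad(30.0)
R = np.array([[1.0, 0.0, 0.0],
              [0.0, np.cos(theta), -np.sin(theta)],
              [0.0, np.sin(theta), np.cos(theta)]])
t = np.array([0.0, 0.0, 5.0])

# Full 3x4 projection matrix A*[R|t].
P = A @ np.hstack([R, t.reshape(3, 1)])

# Ground-plane homography: keep columns r1, r2 of R and the translation t.
H = A @ np.column_stack([R[:, 0], R[:, 1], t])

# A ground point (x, y) with z = 0, projected both ways.
x, y = 1.0, 2.0
full = P @ np.array([x, y, 0.0, 1.0])
plane = H @ np.array([x, y, 1.0])

print(full[:2] / full[2])
print(plane[:2] / plane[2])
```

Both print statements should show the same pixel, because the dropped column only ever multiplies z, which is zero on the ground plane.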

Now, you want another kind of homography: the one that relates points on the sensor before and after a pure camera rotation and zoom. That is, you want to rotate the camera down and possibly zoom out. Again, think in terms of the pinhole camera equation: originally you had H1 = A (here I threw out [R|t] as irrelevant for now), and then you rotated your camera, so you have H2 = A*R. In other words, H1 is how you make your image now and H2 is how you want your image to look.
The relation between the two, H12, is what you want to find, and it is also a homography, since homographies form a family of transformations that is closed under composition and inversion (use this simple heuristic: what happens in the family stays in the family). Since the same surface can generate images with either H1 or H2, we can assemble H12 by undoing H1 (back to the ground plane) and applying H2 (from the ground to the sensor's bird's eye view). In a way this resembles operations with vectors; you just have to respect the right-to-left order of matrix application:
H12 = H2*H1^-1 = A*R*A^-1 = P*A^-1, where we substituted the expressions for H1 and H2 and, finally, the projection matrix P = A*R (in case you do have it).
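
A small numerical sketch of this formula follows. The intrinsics and the 10 degree rotation about the camera x axis are placeholder values, the helper names (`rot_x`, `rotation_homography`, `apply_homography`) are hypothetical, and the sketch assumes R maps coordinates in the original camera frame to coordinates in the rotated camera frame.

```python
import numpy as np


def rot_x(angle_rad):
    """Rotation matrix about the x axis."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, c, -s],
                     [0.0, s, c]])


def rotation_homography(A, R):
    """Return H12 = A * R * A^-1 for a pure camera rotation R."""
    return A @ R @ np.linalg.inv(A)


def apply_homography(H, uv):
    """Map one pixel (u, v) through H and normalise the result."""
    p = H @ np.array([uv[0], uv[1], 1.0])
    return p[:2] / p[2]


# Hypothetical intrinsics: focal length 800 px, principal point (320, 240).
A = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

angle = np.deg2rad(10.0)
H12 = rotation_homography(A, rot_x(angle))

# The principal point shifts vertically by f * tan(angle) pixels.
center = apply_homography(H12, (320.0, 240.0))
print(center)
```

With the identity rotation the function returns the identity homography, which is a quick sanity check that the order of the factors is right.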

This is your answer. If the rotation R is unknown, it can be guessed from the camera orientation w.r.t. the ground or calculated using solvePnP() from the OpenCV library. Finally, when I do this on a cell phone, I just use its accelerometer readings as a good approximation: when the phone is not accelerating, the readings represent the gravity vector, which gives the rotation w.r.t. flat horizontal ground.
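
One possible way to turn a gravity reading into a rotation is sketched below. It is an illustration under several assumptions that are not in the answer: the accelerometer axes are taken to coincide with the camera axes, the reading is taken to point along gravity (sign conventions differ between platforms), and the goal is taken to be a rotation that brings gravity onto the optical axis [0, 0, 1]. The helper `rotation_aligning` is a hypothetical name; the rotation about the gravity axis itself (yaw) cannot be recovered from gravity alone.

```python
import numpy as np


def rotation_aligning(a, b):
    """Rotation matrix that turns the direction of a into the direction of b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    v = np.cross(a, b)          # rotation axis scaled by sin(angle)
    s = np.linalg.norm(v)       # sin(angle)
    c = float(np.dot(a, b))     # cos(angle)
    if s < 1e-12:
        if c > 0:
            return np.eye(3)    # already aligned
        # Opposite directions: rotate 180 degrees about any axis normal to a.
        axis = np.cross(a, [1.0, 0.0, 0.0])
        if np.linalg.norm(axis) < 1e-6:
            axis = np.cross(a, [0.0, 1.0, 0.0])
        axis = axis / np.linalg.norm(axis)
        return 2.0 * np.outer(axis, axis) - np.eye(3)
    # Rodrigues' formula written in terms of the cross-product matrix of v.
    K = np.array([[0.0, -v[2], v[1]],
                  [v[2], 0.0, -v[0]],
                  [-v[1], v[0], 0.0]])
    return np.eye(3) + K + (K @ K) * ((1.0 - c) / s**2)


# Made-up reading in m/s^2: camera tilted 45 degrees from straight down.
gravity = np.array([0.0, 6.94, 6.94])
R = rotation_aligning(gravity, [0.0, 0.0, 1.0])
print(R)
```

The resulting R could then be used as the rotation in H12 = A*R*A^-1, keeping in mind that on a real device the sensor and camera frames usually need an extra fixed alignment.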

When you plot your bird's eye view as an image, you will notice that its boundary has turned from a rectangle into some kind of trapezoid (due to the camera frustum shape) and that there are holes at distant locations (due to the insufficient sampling rate). You can interpolate inside the holes using warpPerspective().
