The coordinate system of the pinhole camera model


Problem description



Recently, I have been studying the pinhole camera model, but I was confused by the differences between the model provided by OpenCV and the one in the textbook "Multiple View Geometry in Computer Vision".

I know that the following photo shows a simplified model, which swaps the positions of the image plane and the camera frame. For better illustration and understanding, and taking the principal point (u0, v0) into account, the relation between the two frames is x = f(X/Z) + u0 and y = f(Y/Z) + v0.
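That mapping can be sketched numerically. The focal length and principal point below are illustrative values, not taken from any particular camera:

```python
def project(X, Y, Z, f, u0, v0):
    """Project a camera-frame point (X, Y, Z) onto the image plane
    using the simplified (front-projection) pinhole model."""
    x = f * (X / Z) + u0
    y = f * (Y / Z) + v0
    return x, y

# Illustrative intrinsics: f = 800 (pixels), principal point (320, 240)
x, y = project(1.0, 0.5, 4.0, 800.0, 320.0, 240.0)
print(x, y)  # 520.0 340.0
```

A point on the optical axis (X = Y = 0) projects exactly onto the principal point, which is a quick sanity check for the formula.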

However, I was really confused, because the image coordinate system is normally a fourth-quadrant-style coordinate system (x to the right, y downward), like the following one!

Could I directly substitute the (x, y) of the following definition into the "equivalent" pinhole model above? That step did not seem really persuasive to me.

Besides, if an object lies in the (+X, +Y) quadrant of the camera coordinate system (with Z > f, of course), then in the equivalent model it should appear on the right half-plane of the image coordinate system. However, in an image taken by a real camera, such an object is supposed to appear on the left half. Therefore, this model did not seem reasonable to me.

Finally, I tried to derive the relations from the original model, shown below.

The result is x1=-f(X/Z) and y1=-f(Y/Z).

Then, I tried to find the relation between (x2,y2)-coordinate and the camera coordinate. The result is x2=-f(X/Z)+u0 and y2=-f(Y/Z)+v0.

Between the (x3, y3) coordinate system and the camera coordinate system, the result is x3=-f(X/Z)+u0 and y3=f(Y/Z)+v0.

No matter which coordinate system I tried, none of them gives the form x=f(X/Z)+u0 and y=f(Y/Z)+v0 that some CV textbooks provide.
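The mismatch is easy to check numerically. For a point with X > 0, each of the three derived conventions places the projection left of the principal point, while the textbook form places it to the right (intrinsics below are illustrative):

```python
f, u0, v0 = 800.0, 320.0, 240.0   # illustrative intrinsics
X, Y, Z = 1.0, 0.5, 4.0           # point in the (+X, +Y, +Z) region

# The three conventions derived above:
x1, y1 = -f * X / Z,       -f * Y / Z        # (-200.0, -100.0)
x2, y2 = -f * X / Z + u0,  -f * Y / Z + v0   # (120.0, 140.0)
x3, y3 = -f * X / Z + u0,   f * Y / Z + v0   # (120.0, 340.0)

# The textbook form:
xt, yt = f * X / Z + u0,    f * Y / Z + v0   # (520.0, 340.0)

# x2 and x3 both land left of u0; the textbook x lands right of u0.
assert x2 < u0 and x3 < u0 and xt > u0
```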

Besides, the projections onto the (x2, y2) or (x3, y3) coordinate systems are also unreasonable, for the same reason: an object in the (+X, +Y, +Z) region of the camera coordinate system should "appear" on the left half-plane of the image taken by a camera.

Could anyone point out what I misunderstood?

Solution

I finally figured out this issue and confirmed my interpretation by implementing the paper: Z. Zhang, "Flexible Camera Calibration by Viewing a Plane from Unknown Orientations," International Conference on Computer Vision (ICCV '99), Corfu, Greece, pp. 666-673, September 1999.

Let me explain everything from scratch. The following photo shows the original pinhole camera model and the projected result on the image sensor. However, this is not what we are supposed to see in the "image".

What we should see is

Comparing figures 1 and 2, we should notice that the picture is upside-down and mirrored left-to-right. A friend of mine who works for a CMOS sensor company told me that sensors have built-in functions to automatically flip the captured image.
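That built-in flip is just a 180-degree rotation of the pixel array: upside-down plus left-to-right. A minimal sketch with NumPy (the tiny array stands in for a sensor readout):

```python
import numpy as np

# A tiny 2x3 "sensor readout" standing in for the raw image.
sensor = np.array([[1, 2, 3],
                   [4, 5, 6]])

# Flip up-down and left-right; equivalent to np.rot90(sensor, 2).
displayed = np.flipud(np.fliplr(sensor))
print(displayed)
# [[6 5 4]
#  [3 2 1]]
```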

Since we want to model the relationship between the image coordinates and the world coordinates, we should treat the image sensor itself as the projection plane. What confused me previously was that the projection is always drawn from the projected side, which misled my geometric understanding of the derivation.

Now, we should look at the image sensor from the "back", along the blue (View Perspective) arrow.

The result is figure 2: the x1 and y1 axes now point right and down respectively, so the equations are

x1=-f(X/Z)
y1=-f(Y/Z)

Now, in terms of the x-y coordinate system, the equations are

x=f(X/Z)+u0
y=f(Y/Z)+v0

which are what the paper described.
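My reading of the step above (an interpretation, not stated explicitly in the paper) is that the sensor flip negates both axes and re-centers them at the principal point, i.e. x = u0 - x1 and y = v0 - y1, which recovers the textbook form. A quick numerical check:

```python
f, u0, v0 = 800.0, 320.0, 240.0   # illustrative intrinsics
X, Y, Z = 1.0, 0.5, 4.0

# Sensor-plane coordinates (figure 2): both signs negated.
x1 = -f * X / Z
y1 = -f * Y / Z

# The flip negates both axes and re-centers at the principal point
# (my reading of the derivation above):
x = u0 - x1
y = v0 - y1

assert (x, y) == (f * X / Z + u0, f * Y / Z + v0)
print(x, y)  # 520.0 340.0
```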

Now, let us take a look at the equivalent model, which doesn't exist in the real world but helps with visual interpretation.

The principle is the same: look from the center of projection towards the image plane. The result is

where the projected "F" is mirrored left-to-right. The equations are

x1=f(X/Z)
y1=f(Y/Z)

Now, in terms of the x-y coordinate system, the equations are

x=f(X/Z)+u0
y=f(Y/Z)+v0

which are what the paper described.

Last but not least, since world coordinates are measured in mm or inches while image coordinates are in pixels, there are scaling factors, which some books write as

x=a*f(X/Z)+u0 
y=b*f(Y/Z)+v0

or

x=fx(X/Z)+u0
y=fy(Y/Z)+v0

where fx = a*f and fy = b*f.
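These scaled equations are exactly what the intrinsic matrix K in OpenCV and in Zhang's paper encodes. A minimal projection through K using homogeneous coordinates (all numeric values below are illustrative):

```python
import numpy as np

fx, fy = 800.0, 820.0      # fx = a*f, fy = b*f, in pixels
u0, v0 = 320.0, 240.0      # principal point, in pixels

# Intrinsic matrix (zero skew assumed)
K = np.array([[fx, 0.0, u0],
              [0.0, fy, v0],
              [0.0, 0.0, 1.0]])

P = np.array([1.0, 0.5, 4.0])    # point in camera coordinates
uvw = K @ P                      # homogeneous image point
u, v = uvw[0] / uvw[2], uvw[1] / uvw[2]
print(u, v)  # 520.0 342.5
```

Dividing by the third homogeneous component performs the X/Z, Y/Z perspective division, so the result matches u = fx(X/Z) + u0 and v = fy(Y/Z) + v0.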
