Basic render 3D perspective projection onto 2D screen with camera (without opengl)


Problem description

Let's say I have a data structure like the following:

Camera {
   double x, y, z

   /** ideally the camera angle is positioned to aim at the 0,0,0 point */
   double angleX, angleY, angleZ;
}

SomePointIn3DSpace {
   double x, y, z
}

ScreenData {
   /** Convert from some point 3d space to 2d space, end up with x, y */
   int x_screenPositionOfPt, y_screenPositionOfPt

   double zFar = 100;

   int width=640, height=480
}

...

Without screen clipping or much of anything else, how would I calculate the screen x, y position of a given 3D point in space? I want to project that 3D point onto the 2D screen.

Camera.x = 0;
Camera.y = 10;
Camera.z = -10;

/** ideally, I want the camera to point at the ground at 3d space 0,0,0 */
Camera.angleX = ???;
Camera.angleY = ???;
Camera.angleZ = ???;

SomePointIn3DSpace.x = 5;
SomePointIn3DSpace.y = 5;
SomePointIn3DSpace.z = 5;

ScreenData.x and y are the screen position of the 3D point in space. How do I calculate those values?

I could possibly use the equations found here, but I don't understand how the screen width/height comes into play. Also, I don't understand from the wiki entry how the viewer's position differs from the camera position.

http://en.wikipedia.org/wiki/3D_projection

Solution

The 'way it's done' is to use homogeneous transformations and coordinates. You take a point in space and:

  • Position it relative to the camera using the model matrix.
  • Project it either orthographically or in perspective using the projection matrix.
  • Apply the viewport transformation to place it on the screen.

This gets pretty vague, but I'll try to cover the important bits and leave some of it to you. I assume you understand the basics of matrix math :).

Homogeneous Vectors, Points, Transformations

In 3D, a homogeneous point is a column matrix of the form [x, y, z, 1]. The final component is 'w', a scaling factor. For vectors, w is 0, which means a translation has no effect on them; that is mathematically correct, since a direction has no position. We won't go there, we're talking points.

Homogeneous transformations are 4x4 matrices. They are used because they allow translation to be represented as a matrix multiplication rather than an addition, which is nice and quick for your video card. They are also convenient because we can represent successive transformations by multiplying the matrices together. We apply a transformation to a point by computing transformation * point.
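As a rough illustration of the paragraph above, here is a minimal sketch in C of a homogeneous point and a 4x4 matrix. The names Vec4, Mat4, mat4_translation, mat4_mul_vec4 and mat4_mul are made up for this example, and the column-major storage is an assumption chosen to match the projection code further down.

```c
/* Homogeneous point or vector: w = 1 for points, w = 0 for directions. */
typedef struct { float x, y, z, w; } Vec4;

/* 4x4 matrix, column major: element (row, col) is stored at m[col * 4 + row]. */
typedef struct { float m[16]; } Mat4;

Mat4 mat4_identity(void)
{
    Mat4 r = {{0}};
    r.m[0] = r.m[5] = r.m[10] = r.m[15] = 1.0f;
    return r;
}

/* Translation lives in the last column, so it only affects inputs with w != 0. */
Mat4 mat4_translation(float tx, float ty, float tz)
{
    Mat4 r = mat4_identity();
    r.m[12] = tx;
    r.m[13] = ty;
    r.m[14] = tz;
    return r;
}

/* transformation * point */
Vec4 mat4_mul_vec4(const Mat4 *a, Vec4 p)
{
    Vec4 r;
    r.x = a->m[0] * p.x + a->m[4] * p.y + a->m[8]  * p.z + a->m[12] * p.w;
    r.y = a->m[1] * p.x + a->m[5] * p.y + a->m[9]  * p.z + a->m[13] * p.w;
    r.z = a->m[2] * p.x + a->m[6] * p.y + a->m[10] * p.z + a->m[14] * p.w;
    r.w = a->m[3] * p.x + a->m[7] * p.y + a->m[11] * p.z + a->m[15] * p.w;
    return r;
}

/* a * b: applying the product to a point applies b first, then a. */
Mat4 mat4_mul(const Mat4 *a, const Mat4 *b)
{
    Mat4 r;
    for (int col = 0; col < 4; ++col) {
        for (int row = 0; row < 4; ++row) {
            float sum = 0.0f;
            for (int k = 0; k < 4; ++k)
                sum += a->m[k * 4 + row] * b->m[col * 4 + k];
            r.m[col * 4 + row] = sum;
        }
    }
    return r;
}
```

Multiplying mat4_translation(1, 2, 3) by the point [5, 5, 5, 1] moves it to [6, 7, 8, 1], while the vector [5, 5, 5, 0] comes back unchanged, which is the w = 0 behaviour described above.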

There are 3 primary homogeneous transformations:

  • Translation
  • Rotation
  • Scaling

There are others, notably the 'look at' transformation, which are worth exploring. However, I just wanted to give a brief list and a few links. The successive application of translation, scaling and rotation to points is collectively the model transformation matrix, which places them in the scene relative to the camera. It's important to realise that what we're doing is akin to moving objects around the camera, not the other way around.

Orthographic and Perspective

To transform from world coordinates into screen coordinates, you would first use a projection matrix, which commonly comes in two flavors:

  • Orthographic, commonly used for 2D and CAD.
  • Perspective, good for games and 3D environments.

An orthographic projection matrix is constructed from parameters that include:

  • Top: The Y coordinate of the top edge of visible space.
  • Bottom: The Y coordinate of the bottom edge of the visible space.
  • Left: The X coordinate of the left edge of the visible space.
  • Right: The X coordinate of the right edge of the visible space.
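For reference, the usual OpenGL-style orthographic matrix built from these parameters looks like the following. This is the standard glOrtho form rather than something taken from this answer, and the near and far clip distances n and f are an assumption, since they are not in the list above.

```latex
% l, r, b, t = Left, Right, Bottom, Top; n, f = near and far distances (assumed)
M_{\text{ortho}} =
\begin{bmatrix}
\dfrac{2}{r - l} & 0 & 0 & -\dfrac{r + l}{r - l} \\
0 & \dfrac{2}{t - b} & 0 & -\dfrac{t + b}{t - b} \\
0 & 0 & -\dfrac{2}{f - n} & -\dfrac{f + n}{f - n} \\
0 & 0 & 0 & 1
\end{bmatrix}
```

It maps the box between the left/right, bottom/top and near/far planes onto the cube from -1 to 1 on each axis, with no dependence on depth for x and y.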

I think that's pretty simple. What you establish is an area of space that is going to appear on the screen, which you can clip against. It's simple here because the visible area of space is a rectangular box. Clipping in perspective is more complicated because the area which appears on screen, the viewing volume, is a frustum.

If you're having a hard time with the Wikipedia article on perspective projection, here's the code to build a suitable matrix, courtesy of geeks3D:

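#include <math.h>

/* pi / 360: converts a full angle in degrees to a half angle in radians */
#define PI_OVER_360 0.00872664625997f

/* Fills m (16 floats, column major) with a perspective projection matrix.
   fov is the vertical field of view in degrees, aspect is width / height. */
void BuildPerspProjMat(float *m, float fov, float aspect,
                       float znear, float zfar)
{
  /* half-extent of the near plane, before the aspect correction */
  float xymax = znear * tan(fov * PI_OVER_360);
  float ymin = -xymax;
  float xmin = -xymax;

  float width = xymax - xmin;
  float height = xymax - ymin;

  float depth = zfar - znear;
  float q = -(zfar + znear) / depth;
  float qn = -2 * (zfar * znear) / depth;

  float w = 2 * znear / width;
  w = w / aspect;
  float h = 2 * znear / height;

  m[0]  = w;
  m[1]  = 0;
  m[2]  = 0;
  m[3]  = 0;

  m[4]  = 0;
  m[5]  = h;
  m[6]  = 0;
  m[7]  = 0;

  m[8]  = 0;
  m[9]  = 0;
  m[10] = q;
  m[11] = -1;  /* copies -z into w, which drives the perspective divide */

  m[12] = 0;
  m[13] = 0;
  m[14] = qn;
  m[15] = 0;
}

Variables are:

  • fov: Field of view. The code multiplies it by PI_OVER_360, so it is expected in degrees; 45 degrees (pi/4 radians) is a good value.
  • aspect: Ratio of width to height (the code divides the x scale by it).
  • znear, zfar: the near and far clip distances, used for clipping; I'll ignore these.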

The generated matrix is column major, indexed as follows in the above code:

0   4   8  12
1   5   9  13
2   6  10  14
3   7  11  15

Viewport Transformation, Screen Coordinates

Both of these projections require another matrix to put things in screen coordinates, called the viewport transformation. That's described in the article linked in the summary below, so I won't cover it (it's dead simple).

Thus, for a point p, we would:

  • Perform model transformation matrix * p, resulting in pm.
  • Perform projection matrix * pm, resulting in pp.
  • Clip pp against the viewing volume.
  • Divide pp by its w component (the perspective divide), then perform viewport transformation matrix * pp, resulting in ps: the point on screen.

Summary

I hope that covers most of it. There are holes in the above and it's vague in places, so post any questions below. This subject is usually worthy of a whole chapter in a textbook; I've done my best to distill the process, hopefully to your advantage!

I linked to this above, but I strongly suggest you read it and download the binary. It's an excellent tool to further your understanding of these transformations and how they get points onto the screen:

http://www.songho.ca/opengl/gl_transform.html

As far as actual work goes, you'll need to implement a 4x4 matrix class for homogeneous transformations, as well as a homogeneous point class that you can multiply by a matrix to apply transformations (remember, [x, y, z, 1]). You'll need to generate the transformations as described above and in the links. It's not all that difficult once you understand the procedure. Best of luck :).
