Trying to understand the math behind the perspective matrix in WebGL


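Question

All matrix libraries geared for WebGL have some sort of perspective function that you can call to get the perspective matrix for the scene.
For example, the perspective method within the mat4.js file that's part of gl-matrix is coded as such: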


mat4.perspective = function (out, fovy, aspect, near, far) {
    var f = 1.0 / Math.tan(fovy / 2),
        nf = 1 / (near - far);
    out[0] = f / aspect;
    out[1] = 0;
    out[2] = 0;
    out[3] = 0;
    out[4] = 0;
    out[5] = f;
    out[6] = 0;
    out[7] = 0;
    out[8] = 0;
    out[9] = 0;
    out[10] = (far + near) * nf;
    out[11] = -1;
    out[12] = 0;
    out[13] = 0;
    out[14] = (2 * far * near) * nf;
    out[15] = 0;
    return out;
};

I'm really trying to understand what all the math in this method is actually doing, but I'm tripping up on several points.

For starters, if we have a canvas as follows with an aspect ratio of 4:3, then the aspect parameter of the method would in fact be 4 / 3, correct?

I've also noticed that 45° seems like a common field of view. If that's the case, then the fovy parameter would be π / 4 radians, correct?

With all that said, what is the f variable in the method short for and what is the purpose of it?
I was trying to envision the actual scenario, and I imagined something like the following:

Thinking like this, I can understand why you divide fovy by 2 and also why you take the tangent of that ratio, but why is the inverse of that stored in f? Again, I'm having a lot of trouble understanding what f really represents.

Next, I get the concept of near and far being the clipping points along the z-axis, so that's fine, but if I use the numbers in the picture above (i.e., π / 4, 4 / 3, 10 and 100) and plug them into the perspective method, then I end up with a matrix like the following:

Where f is equal to:
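
The matrix and the value of f were shown as images in the original post and are not reproduced here. As a rough stand-in, the following sketch plugs the same numbers into a standalone copy of the formulas quoted above. The function name `perspective` and the variable `m` are local names chosen for this example, not part of gl-matrix.

```javascript
// Standalone copy of the formulas from the snippet above, returning a new array.
function perspective(fovy, aspect, near, far) {
  const f = 1.0 / Math.tan(fovy / 2);
  const nf = 1 / (near - far);
  return [
    f / aspect, 0, 0, 0,
    0, f, 0, 0,
    0, 0, (far + near) * nf, -1,
    0, 0, 2 * far * near * nf, 0,
  ];
}

// The numbers from the question: fovy = pi / 4, aspect = 4 / 3, near = 10, far = 100.
const m = perspective(Math.PI / 4, 4 / 3, 10, 100);

console.log(m[5]);  // f = 1 / tan(pi / 8), about 2.414
console.log(m[0]);  // f / aspect, about 1.811
console.log(m[10]); // 110 / -90, about -1.222
console.log(m[11]); // -1
console.log(m[14]); // 2000 / -90, about -22.222
```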

So I'm left with the following questions:

  1. What is f?
  2. What does the value assigned to out[10] (i.e., 110 / -90) represent?
  3. What does the -1 assigned to out[11] do?
  4. What does the value assigned to out[14] (i.e., 2000 / -90) represent?

Lastly, I should note that I have already read Gregg Tavares's explanation on the perspective matrix, but after all of that, I'm left with the same confusion.

Answer

Let's see if I can explain this, or maybe after reading this you can come up with a better way to explain it.

The first thing to realize is WebGL requires clipspace coordinates. They go -1 <-> +1 in x, y, and z. So, a perspective matrix is basically designed to take the space inside the frustum and convert it to clipspace.

If you look at this diagram

we know that tangent = opposite (y) over adjacent (z), so if we know z we can compute the y that would be sitting at the edge of the frustum for a given fovY.

tan(fovY / 2) = y / -z

multiply both sides by -z

y = tan(fovY / 2) * -z

if we define

f = 1 / tan(fovY / 2)

we get

y = -z / f

Note that we haven't yet done a conversion from cameraspace to clipspace. All we've done is compute y at the edge of the field of view for a given z in cameraspace. The edge of the field of view is also the edge of clipspace. Since clipspace is just +1 to -1, we can divide a cameraspace y by -z / f to get clipspace.

Does that make sense? Look at the diagram again. Let's assume that the blue z was -5 and for some given field of view y came out to +2.34. We need to convert +2.34 to +1 clipspace. The generic version of that is

clipY = cameraY * f / -z
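
As a quick numeric check of that relationship, here is a minimal sketch. The 45 degree field of view and the variable names are assumptions chosen for illustration; they are not the values behind the +2.34 in the diagram.

```javascript
// Assumed values: a 45 degree vertical field of view and a point at z = -5.
const fovY = Math.PI / 4;
const f = 1 / Math.tan(fovY / 2);
const z = -5;

// y at the top edge of the frustum for this z: y = -z / f
const edgeY = -z / f;

// Converting that camera-space y to clip space should land on +1.
const clipY = edgeY * f / -z;

console.log(edgeY); // about 2.07 for these assumed values
console.log(clipY); // 1, up to floating-point rounding
```

A point halfway up (edgeY / 2) would come out at 0.5 by the same formula.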

Looking at makePerspective

function makePerspective(fieldOfViewInRadians, aspect, near, far) {
  var f = Math.tan(Math.PI * 0.5 - 0.5 * fieldOfViewInRadians);
  var rangeInv = 1.0 / (near - far);

  return [
    f / aspect, 0, 0, 0,
    0, f, 0, 0,
    0, 0, (near + far) * rangeInv, -1,
    0, 0, near * far * rangeInv * 2, 0
  ];
};

we can see that f in this case is

tan(Math.PI * 0.5 - 0.5 * fovY)

which is actually the same as

1 / tan(fovY / 2)

Why is it written this way? I'm guessing it's because with the 1 / tan(...) style, if tan came out to 0 you'd be dividing by 0, whereas if you do it this way there's no division, so there's no chance of a divide by zero.
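
The equivalence is easy to check numerically. The sketch below is only an illustration with hand-picked angles. It also shows what JavaScript itself does in the degenerate case: dividing by zero yields Infinity rather than throwing, while the other form gives a very large finite number.

```javascript
// Compare the two ways of computing f for a few fields of view (in radians).
for (const fovY of [Math.PI / 6, Math.PI / 4, Math.PI / 2]) {
  const viaDivision = 1 / Math.tan(fovY / 2);
  const viaShift = Math.tan(Math.PI * 0.5 - 0.5 * fovY);
  console.log(fovY, viaDivision, viaShift); // the two values agree up to rounding
}

// Degenerate case: a field of view of 0.
const divided = 1 / Math.tan(0);             // Infinity in JavaScript, no exception
const shifted = Math.tan(Math.PI * 0.5 - 0); // huge but finite: Math.PI / 2 is not exactly pi / 2
console.log(divided, shifted);
```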

Seeing that -1 is in the matrix[11] spot means that when we're all done

matrix[5]  = tan(Math.PI * 0.5 - 0.5 * fovY)
matrix[11] = -1

clipY = cameraY * matrix[5] / (cameraZ * matrix[11])

For clipX we basically do the exact same calculation except scaled for the aspect ratio.

matrix[0]  = tan(Math.PI * 0.5 - 0.5 * fovY) / aspect
matrix[11] = -1

clipX = cameraX * matrix[0] / (cameraZ * matrix[11])
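
To see where the division by cameraZ * matrix[11] comes from, here is a hedged sketch of the full multiply followed by the divide by w. The helper `transformPoint` is a hypothetical name written for this example, and the field of view, aspect ratio and clip planes are assumed values. The array is treated as column-major, which is the layout WebGL expects.

```javascript
// Same formulas as makePerspective above, repeated so the sketch runs on its own.
function makePerspective(fieldOfViewInRadians, aspect, near, far) {
  const f = Math.tan(Math.PI * 0.5 - 0.5 * fieldOfViewInRadians);
  const rangeInv = 1.0 / (near - far);
  return [
    f / aspect, 0, 0, 0,
    0, f, 0, 0,
    0, 0, (near + far) * rangeInv, -1,
    0, 0, near * far * rangeInv * 2, 0,
  ];
}

// Hypothetical helper: multiply the camera-space point (x, y, z, 1) by a
// column-major 4x4 matrix (elements 0-3 form the first column).
function transformPoint(m, x, y, z) {
  return [
    m[0] * x + m[4] * y + m[8] * z + m[12],
    m[1] * x + m[5] * y + m[9] * z + m[13],
    m[2] * x + m[6] * y + m[10] * z + m[14],
    m[3] * x + m[7] * y + m[11] * z + m[15],
  ];
}

// Assumed values: 45 degree field of view, 4:3 aspect, near = 10, far = 100.
const fovY = Math.PI / 4;
const aspect = 4 / 3;
const m = makePerspective(fovY, aspect, 10, 100);

// A point on the top-right edge of the frustum at cameraZ = -50.
const cameraZ = -50;
const edgeY = Math.tan(fovY / 2) * -cameraZ;
const edgeX = edgeY * aspect;

const [cx, cy, cz, cw] = transformPoint(m, edgeX, edgeY, cameraZ);
console.log(cw);               // 50: the -1 in matrix[11] copies -cameraZ into w
console.log(cx / cw, cy / cw); // both 1, up to floating-point rounding
```

The matrix itself only scales x and y and stores -cameraZ in w. The division by w that completes the perspective projection is done by the GPU after the vertex shader runs.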

Finally, we have to convert cameraZ in the -zNear <-> -zFar range to clipZ in the -1 <-> +1 range.

The standard perspective matrix does this with a reciprocal function, so that z values close to the camera get more resolution than z values far from the camera. That formula is

clipZ = something / cameraZ + constant

Let's use s for something and c for constant.

clipZ = s / cameraZ + c;

and solve for s and c. In our case we know

s / -zNear + c = -1
s / -zFar  + c =  1

So, move the c to the other side

s / -zNear = -1 - c
s / -zFar  =  1 - c

Multiply each equation by its own denominator (-zNear or -zFar)

s = (-1 - c) * -zNear
s = ( 1 - c) * -zFar

Those 2 things now equal each other so

(-1 - c) * -zNear = (1 - c) * -zFar

expand the quantities

(-zNear * -1) - (c * -zNear) = (1 * -zFar) - (c * -zFar)

simplify

zNear + c * zNear = -zFar + c * zFar

move zNear to the right

c * zNear = -zFar + c * zFar - zNear

move c * zFar to the left

c * zNear - c * zFar = -zFar - zNear

simplify

c * (zNear - zFar) = -(zFar + zNear)

divide by (zNear - zFar)

c = -(zFar + zNear) / (zNear - zFar)

solve for s

s = (1 - -((zFar + zNear) / (zNear - zFar))) * -zFar

simplify

s = (1 + ((zFar + zNear) / (zNear - zFar))) * -zFar

change the 1 to (zNear - zFar) / (zNear - zFar)

s = ((zNear - zFar + zFar + zNear) / (zNear - zFar)) * -zFar

simplify

s = ((2 * zNear) / (zNear - zFar)) * -zFar

simplify some more, keeping the minus sign from -zFar

s = -(2 * zNear * zFar) / (zNear - zFar)

dang I wish stackexchange supported math like their math site does :(

So, back to the top. Our formula was

s / cameraZ + c

And we know s and c now.

clipZ = (2 * zNear * zFar) / (zNear - zFar) / -cameraZ -
        (zFar + zNear) / (zNear - zFar)
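
A quick way to sanity-check this expression is to evaluate it at the two clip planes. The sketch below assumes zNear = 10 and zFar = 100 purely for illustration; `clipZ` is a local helper name.

```javascript
// Assumed clip planes for illustration.
const zNear = 10;
const zFar = 100;

// clipZ exactly as written above, with the sign of s folded into -cameraZ.
function clipZ(cameraZ) {
  return (2 * zNear * zFar) / (zNear - zFar) / -cameraZ -
         (zFar + zNear) / (zNear - zFar);
}

console.log(clipZ(-zNear)); // -1, up to floating-point rounding
console.log(clipZ(-zFar));  //  1, up to floating-point rounding
console.log(clipZ(-55));    // about 0.82, although -55 is halfway between the planes
```

The last line illustrates the reciprocal behavior: half of the camera-space range is squeezed into roughly the top tenth of the clip-space range, which leaves most of the depth resolution for points near the camera.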

let's move the -cameraZ outside

clipZ = ((2 * zNear * zFar) / (zNear - zFar) +
         (zFar + zNear) / (zNear - zFar) * cameraZ) / -cameraZ

we can change / (zNear - zFar) to * 1 / (zNear - zFar) so

rangeInv = 1 / (zNear - zFar)
clipZ = ((2 * zNear * zFar) * rangeInv +
         (zFar + zNear) * rangeInv * cameraZ) / -cameraZ

Looking back at makePerspective, we see it's going to end up making

clipZ = (matrix[10] * cameraZ + matrix[14]) / (cameraZ * matrix[11])

Comparing that with the formula above, it fits:

rangeInv = 1 / (zNear - zFar)
matrix[10] = (zFar + zNear) * rangeInv
matrix[14] = 2 * zNear * zFar * rangeInv
matrix[11] = -1
clipZ = (matrix[10] * cameraZ + matrix[14]) / (cameraZ * matrix[11])
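
As a final cross-check, the matrix form can be compared against the s / cameraZ + c form at a few depths. This is only a sketch under assumed values (zNear = 10, zFar = 100); `m10`, `m11`, `m14` and both helper functions are names invented for the example, and s and c are computed directly from the two boundary conditions rather than copied from the derivation.

```javascript
// Assumed clip planes for illustration.
const zNear = 10;
const zFar = 100;

// The three matrix entries that affect z and w.
const rangeInv = 1 / (zNear - zFar);
const m10 = (zFar + zNear) * rangeInv;
const m14 = 2 * zNear * zFar * rangeInv;
const m11 = -1;

function clipZFromMatrix(cameraZ) {
  return (m10 * cameraZ + m14) / (cameraZ * m11);
}

// s and c solved from s / -zNear + c = -1 and s / -zFar + c = 1.
const s = 2 / (1 / zNear - 1 / zFar);
const c = 1 + s / zFar;

function clipZFromReciprocal(cameraZ) {
  return s / cameraZ + c;
}

for (const cameraZ of [-10, -20, -55, -100]) {
  console.log(cameraZ, clipZFromMatrix(cameraZ), clipZFromReciprocal(cameraZ));
}
```

Both columns should agree up to floating-point rounding, starting at -1 for cameraZ = -10 and ending at 1 for cameraZ = -100.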

I hope that made sense. Note: Most of this is just my re-writing of this article.
