Trying to understand the math behind the perspective matrix in WebGL
All matrix libraries geared for WebGL have some sort of perspective function that you can call to get the perspective matrix for the scene.
For example, the perspective method in the mat4.js file that's part of gl-matrix is coded as such:
mat4.perspective = function (out, fovy, aspect, near, far) {
    var f = 1.0 / Math.tan(fovy / 2),
        nf = 1 / (near - far);
    out[0] = f / aspect;
    out[1] = 0;
    out[2] = 0;
    out[3] = 0;
    out[4] = 0;
    out[5] = f;
    out[6] = 0;
    out[7] = 0;
    out[8] = 0;
    out[9] = 0;
    out[10] = (far + near) * nf;
    out[11] = -1;
    out[12] = 0;
    out[13] = 0;
    out[14] = (2 * far * near) * nf;
    out[15] = 0;
    return out;
};
I'm really trying to understand what all the math in this method is actually doing, but I'm tripping up on several points.
For starters, if we have a canvas with an aspect ratio of 4:3, then the aspect parameter of the method would in fact be 4 / 3, correct?
I've also noticed that 45° seems like a common field of view. If that's the case, then the fovy parameter would be π / 4 radians, correct?
With all that said, what is the f variable in the method short for, and what is its purpose?
I was trying to envision the actual scenario, and I imagined something like the following:
Thinking like this, I can understand why you divide fovy by 2 and also why you take the tangent of that half-angle, but why is the inverse of that stored in f? Again, I'm having a lot of trouble understanding what f really represents.
Next, I get the concept of near and far being the clipping points along the z-axis, so that's fine, but if I use the numbers in the picture above (i.e., π / 4, 4 / 3, 10 and 100) and plug them into the perspective method, then I end up with a matrix like the following (in array order, four entries per line):
f / (4 / 3)   0   0            0
0             f   0            0
0             0   110 / -90    -1
0             0   2000 / -90   0
Where f is equal to:
1 / tan(π / 8)
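As a rough numeric check of those values, the sketch below plugs π / 4, 4 / 3, 10 and 100 into a standalone copy of the method quoted above. The function name perspective and the variable m are only for illustration; this is not an import of gl-matrix itself.

```javascript
// Standalone copy of the perspective code quoted above (not imported from gl-matrix).
function perspective(out, fovy, aspect, near, far) {
  const f = 1.0 / Math.tan(fovy / 2);
  const nf = 1 / (near - far);
  out.fill(0);                     // every entry not set below stays 0
  out[0] = f / aspect;
  out[5] = f;
  out[10] = (far + near) * nf;
  out[11] = -1;
  out[14] = 2 * far * near * nf;
  return out;
}

const m = perspective(new Array(16), Math.PI / 4, 4 / 3, 10, 100);

console.log("f       =", m[5]);    // 1 / tan(π / 8)
console.log("out[0]  =", m[0]);    // f / (4 / 3)
console.log("out[10] =", m[10]);   // 110 / -90
console.log("out[14] =", m[14]);   // 2000 / -90
```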
So I'm left with the following questions:
- What is f?
- What does the value assigned to out[10] (i.e., 110 / -90) represent?
- What does the -1 assigned to out[11] do?
- What does the value assigned to out[14] (i.e., 2000 / -90) represent?
Lastly, I should note that I have already read Gregg Tavares's explanation on the perspective matrix, but after all of that, I'm left with the same confusion.
Let's see if I can explain this, or maybe after reading this you can come up with a better way to explain it.
The first thing to realize is WebGL requires clipspace coordinates. They go -1 <-> +1 in x, y, and z. So, a perspective matrix is basically designed to take the space inside the frustum and convert it to clipspace.
If you look at this diagram, we know that tangent = opposite (y) over adjacent (z), so if we know z we can compute the y that would be sitting at the edge of the frustum for a given fovY.
tan(fovY / 2) = y / -z
multiply both sides by -z
y = tan(fovY / 2) * -z
if we define
f = 1 / tan(fovY / 2)
we get
y = -z / f
Note that we haven't yet done a conversion from cameraspace to clipspace. All we've done is compute y at the edge of the field of view for a given z in cameraspace. The edge of the field of view is also the edge of clipspace. Since clipspace is just +1 to -1, we can divide a cameraspace y by -z / f to get clipspace.
Does that make sense? Look at the diagram again. Let's assume that the blue z was -5 and for some given field of view y came out to +2.34. We need to convert +2.34 to +1 clipspace. The generic version of that is
clipY = cameraY * f / -z
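To make that concrete, here is a minimal sketch of the formula. The helper name cameraYToClipY is made up for illustration, and the field of view is chosen (as an assumption) so that the frustum edge sits at y = 2.34 when z = -5, matching the numbers above.

```javascript
// Hypothetical helper: maps a cameraspace y to clipspace y for a given
// vertical field of view, using clipY = cameraY * f / -z.
function cameraYToClipY(cameraY, cameraZ, fovY) {
  const f = 1 / Math.tan(fovY / 2);
  return cameraY * f / -cameraZ;
}

// Field of view picked so that the frustum edge is at y = 2.34 when z = -5.
const fovY = 2 * Math.atan(2.34 / 5);

console.log(cameraYToClipY(2.34, -5, fovY));   // edge of the frustum: approximately 1
console.log(cameraYToClipY(1.17, -5, fovY));   // halfway up: approximately 0.5
console.log(cameraYToClipY(2.34, -10, fovY));  // same y, twice as far away: approximately 0.5
```

The last line shows the perspective effect: the same cameraspace y lands closer to the center of the screen as the point moves away from the camera.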
Looking at makePerspective
function makePerspective(fieldOfViewInRadians, aspect, near, far) {
    var f = Math.tan(Math.PI * 0.5 - 0.5 * fieldOfViewInRadians);
    var rangeInv = 1.0 / (near - far);
    return [
        f / aspect, 0, 0, 0,
        0, f, 0, 0,
        0, 0, (near + far) * rangeInv, -1,
        0, 0, near * far * rangeInv * 2, 0
    ];
};
we can see that f in this case is
tan(Math.PI * 0.5 - 0.5 * fovY)
which is actually the same as
1 / tan(fovY / 2)
Why is it written this way? I'm guessing it's because with the first style, if tan came out to 0 you'd divide by 0 (which gives Infinity in JavaScript and can crash a program in some other languages), whereas if you do it this way there's no division, so no chance of a divide by zero.
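A quick way to convince yourself that the two forms agree is to evaluate both for a few angles. This is only a sketch; the names fDirect and fCotangent are invented here for the comparison.

```javascript
// The two ways of computing f discussed above.
function fDirect(fovY) {
  return 1 / Math.tan(fovY / 2);
}

function fCotangent(fovY) {
  return Math.tan(Math.PI * 0.5 - 0.5 * fovY);
}

for (const degrees of [30, 45, 60, 90, 120]) {
  const fovY = degrees * Math.PI / 180;
  console.log(degrees, fDirect(fovY), fCotangent(fovY));
}

// Degenerate field of view of 0: the direct form divides by zero,
// which JavaScript reports as Infinity rather than throwing.
console.log(fDirect(0));     // Infinity
console.log(fCotangent(0));  // a very large but finite number
```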
Seeing that -1 is in the matrix[11] spot means the matrix puts -cameraZ into the w component, and the GPU divides x, y, and z by w. So when we're all done
matrix[5] = tan(Math.PI * 0.5 - 0.5 * fovY)
matrix[11] = -1
clipY = cameraY * matrix[5] / (cameraZ * matrix[11])
For clipX we basically do the exact same calculation, except scaled for the aspect ratio.
matrix[0] = tan(Math.PI * 0.5 - 0.5 * fovY) / aspect
matrix[11] = -1
clipX = cameraX * matrix[0] / (cameraZ * matrix[11])
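The sketch below runs a point through the whole matrix to show the divide happening. It assumes the array is read as a column-major matrix applied to the column vector [x, y, z, 1], which is the layout gl-matrix uses; projectPoint is a hypothetical helper, not a library function.

```javascript
// Copy of makePerspective from above so the sketch runs on its own.
function makePerspective(fieldOfViewInRadians, aspect, near, far) {
  const f = Math.tan(Math.PI * 0.5 - 0.5 * fieldOfViewInRadians);
  const rangeInv = 1.0 / (near - far);
  return [
    f / aspect, 0, 0, 0,
    0, f, 0, 0,
    0, 0, (near + far) * rangeInv, -1,
    0, 0, near * far * rangeInv * 2, 0
  ];
}

// Hypothetical helper: multiplies the matrix (read as column-major) by
// [x, y, z, 1], then divides x and y by w as the GPU would.
function projectPoint(m, x, y, z) {
  const clipX = m[0] * x + m[4] * y + m[8] * z + m[12];
  const clipY = m[1] * x + m[5] * y + m[9] * z + m[13];
  const clipW = m[3] * x + m[7] * y + m[11] * z + m[15]; // equals -z for this matrix
  return { x: clipX / clipW, y: clipY / clipW, w: clipW };
}

const fov = Math.PI / 4;
const aspectRatio = 4 / 3;
const matrix = makePerspective(fov, aspectRatio, 10, 100);

// A point on the top-right edge of the frustum at z = -20.
const edgeY = Math.tan(fov / 2) * 20;
const edgeX = edgeY * aspectRatio;
const corner = projectPoint(matrix, edgeX, edgeY, -20);

console.log(corner); // x and y come out at approximately 1, w at 20
```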
Finally, we have to convert cameraZ in the -zNear <-> -zFar range to clipZ in the -1 <-> +1 range.
The standard perspective matrix does this with a reciprocal function so that z values close to the camera get more resolution than z values far from the camera. That formula is
clipZ = something / cameraZ + constant
Let's use s for something and c for constant.
clipZ = s / cameraZ + c;
and solve for s and c. In our case we know
s / -zNear + c = -1
s / -zFar + c = 1
So, move the c to the other side
s / -zNear = -1 - c
s / -zFar = 1 - c
Multiply by -zXXX
s = (-1 - c) * -zNear
s = ( 1 - c) * -zFar
Those 2 things now equal each other so
(-1 - c) * -zNear = (1 - c) * -zFar
expand the quantities
(-zNear * -1) - (c * -zNear) = (1 * -zFar) - (c * -zFar)
simplify
zNear + c * zNear = -zFar + c * zFar
move zNear to the right
c * zNear = -zFar + c * zFar - zNear
move c * zFar to the left
c * zNear - c * zFar = -zFar - zNear
simplify
c * (zNear - zFar) = -(zFar + zNear)
divide by (zNear - zFar)
c = -(zFar + zNear) / (zNear - zFar)
solve for s
s = (1 - -((zFar + zNear) / (zNear - zFar))) * -zFar
simplify
s = (1 + ((zFar + zNear) / (zNear - zFar))) * -zFar
change the 1 to (zNear - zFar) / (zNear - zFar)
s = ((zNear - zFar + zFar + zNear) / (zNear - zFar)) * -zFar
simplify
s = ((2 * zNear) / (zNear - zFar)) * -zFar
simplify some more, keeping the minus sign from -zFar
s = -(2 * zNear * zFar) / (zNear - zFar)
Dang, I wish stackexchange supported math like their math site does :(
So, back to the top. Our formula was
s / cameraZ + c
And we know s and c now.
clipZ = (2 * zNear * zFar) / (zNear - zFar) / -cameraZ -
(zFar + zNear) / (zNear - zFar)
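As a sanity check on this expression, the sketch below evaluates it at a few depths using zNear = 10 and zFar = 100 from the question. The function name clipZFromCameraZ is made up for illustration.

```javascript
// Direct transcription of the clipZ expression above.
function clipZFromCameraZ(cameraZ, zNear, zFar) {
  return (2 * zNear * zFar) / (zNear - zFar) / -cameraZ -
         (zFar + zNear) / (zNear - zFar);
}

const zNear = 10;
const zFar = 100;

for (const cameraZ of [-10, -20, -55, -100]) {
  console.log(cameraZ, clipZFromCameraZ(cameraZ, zNear, zFar));
}
```

The near plane comes out at -1 and the far plane at +1, and clipZ is already above 0 by cameraZ = -20. More than half of the clipspace range is spent on the first 10 units of a 90-unit depth range, which is the extra resolution near the camera mentioned earlier.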
let's move the -cameraZ outside
clipZ = ((2 * zNear * zFar) / (zNear - zFar) +
(zFar + zNear) / (zNear - zFar) * cameraZ) / -cameraZ
we can change / (zNear - zFar) to * 1 / (zNear - zFar), so
rangeInv = 1 / (zNear - zFar)
clipZ = ((2 * zNear * zFar) * rangeInv +
(zFar + zNear) * rangeInv * cameraZ) / -cameraZ
Looking back at makePerspective, we see it's going to end up making
clipZ = (matrix[10] * cameraZ + matrix[14]) / (cameraZ * matrix[11])
Looking at the formula above that fits
rangeInv = 1 / (zNear - zFar)
matrix[10] = (zFar + zNear) * rangeInv
matrix[14] = 2 * zNear * zFar * rangeInv
matrix[11] = -1
clipZ = (matrix[10] * cameraZ + matrix[14]) / (cameraZ * matrix[11])
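To tie this back to the code, here is a small sketch that builds the matrix with makePerspective and evaluates that last line at the near and far planes. clipZFromMatrix is a hypothetical helper name, and the values 10 and 100 are the ones from the question.

```javascript
// Copy of makePerspective from above so the sketch runs on its own.
function makePerspective(fieldOfViewInRadians, aspect, near, far) {
  const f = Math.tan(Math.PI * 0.5 - 0.5 * fieldOfViewInRadians);
  const rangeInv = 1.0 / (near - far);
  return [
    f / aspect, 0, 0, 0,
    0, f, 0, 0,
    0, 0, (near + far) * rangeInv, -1,
    0, 0, near * far * rangeInv * 2, 0
  ];
}

// Hypothetical helper: the clipZ expression written in terms of matrix entries.
function clipZFromMatrix(matrix, cameraZ) {
  return (matrix[10] * cameraZ + matrix[14]) / (cameraZ * matrix[11]);
}

const projection = makePerspective(Math.PI / 4, 4 / 3, 10, 100);

console.log(clipZFromMatrix(projection, -10));   // near plane: approximately -1
console.log(clipZFromMatrix(projection, -100));  // far plane: approximately +1
```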
I hope that made sense. Note: Most of this is just my re-writing of this article.