Trying to understand the math behind the perspective matrix in WebGL

Views: 86 | Published: 2020/5/6 10:33:58 | Tags: math matrix opengl-es webgl perspectivecamera

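Problem description

All matrix libraries for WebGL have some sort of perspective function that you can call to get the perspective matrix for the scene.
For example, the perspective method within the mat4.js file that's part of gl-matrix is coded as such: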

mat4.perspective = function (out, fovy, aspect, near, far) {
    var f = 1.0 / Math.tan(fovy / 2),
        nf = 1 / (near - far);
    out[0] = f / aspect;
    out[1] = 0;
    out[2] = 0;
    out[3] = 0;
    out[4] = 0;
    out[5] = f;
    out[6] = 0;
    out[7] = 0;
    out[8] = 0;
    out[9] = 0;
    out[10] = (far + near) * nf;
    out[11] = -1;
    out[12] = 0;
    out[13] = 0;
    out[14] = (2 * far * near) * nf;
    out[15] = 0;
    return out;
};

I'm really trying to understand what all the math in this method is actually doing, but I'm tripping up on several points.

For starters, if we have a canvas as follows with an aspect ratio of 4:3, then the aspect parameter of the method would in fact be 4 / 3, correct?

I've also noticed that 45° seems like a common field of view. If that's the case, then the fovy parameter would be π / 4 radians, correct?

With all that said, what is the f variable in the method short for and what is the purpose of it?
I was trying to envision the actual scenario, and I imagined something like the following:

Thinking like this, I can understand why you divide fovy by 2 and also why you take the tangent of that ratio, but why is the inverse of that stored in f? Again, I'm having a lot of trouble understanding what f really represents.

Next, I get the concept of near and far being the clipping points along the z-axis, so that's fine, but if I use the numbers in the picture above (i.e., π / 4, 4 / 3, 10 and 100) and plug them into the perspective method, then I end up with a matrix like the following:

Where f is equal to:

So I'm left with the following questions:

What is f?
What does the value assigned to out[10] (i.e., 110 / -90) represent?
What does the -1 assigned to out[11] do?
What does the value assigned to out[14] (i.e., 2000 / -90) represent?
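
As a rough check on the numbers in the question, the sketch below plugs π / 4, 4 / 3, 10 and 100 into a standalone copy of the perspective math shown above. The function name `perspective` and the plain-array return value are assumptions made for illustration; this is not the gl-matrix export itself.

```javascript
// Standalone copy of the perspective math from the question, returning a
// plain 16-element array in column-major order (illustrative helper only).
function perspective(fovy, aspect, near, far) {
  const f = 1.0 / Math.tan(fovy / 2);
  const nf = 1 / (near - far);
  return [
    f / aspect, 0, 0, 0,
    0, f, 0, 0,
    0, 0, (far + near) * nf, -1,
    0, 0, 2 * far * near * nf, 0,
  ];
}

const m = perspective(Math.PI / 4, 4 / 3, 10, 100);
console.log('f       =', m[5]);   // 1 / tan(pi / 8), roughly 2.414
console.log('out[0]  =', m[0]);   // f / (4 / 3), roughly 1.811
console.log('out[10] =', m[10]);  // 110 / -90, roughly -1.222
console.log('out[14] =', m[14]);  // 2000 / -90, roughly -22.222
```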

Lastly, I should note that I have already read Gregg Tavares's explanation on the perspective matrix, but after all of that, I'm left with the same confusion.

Solution

Let's see if I can explain this, or maybe after reading this you can come up with a better way to explain it.

The first thing to realize is WebGL requires clipspace coordinates. They go -1 <-> +1 in x, y, and z. So, a perspective matrix is basically designed to take the space inside the frustum and convert it to clipspace.

If you look at this diagram

we know that tangent = opposite (y) over adjacent (z), so if we know z we can compute the y that would be sitting at the edge of the frustum for a given fovY.

tan(fovY / 2) = y / -z

multiply both sides by -z

y = tan(fovY / 2) * -z

if we define

f = 1 / tan(fovY / 2)

we get

y = -z / f

Note that we haven't yet done a conversion from cameraspace to clipspace. All we've done is compute y at the edge of the field of view for a given z in cameraspace. The edge of the field of view is also the edge of clipspace. Since clipspace just runs from -1 to +1, we can divide a cameraspace y by -z / f to get clipspace.

Does that make sense? Look at the diagram again. Let's assume that the blue z was -5 and for some given field of view y came out to +2.34. We need to convert +2.34 to +1 clipspace. The generic version of that is

clipY = cameraY * f / -z
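
To make that concrete, here is a minimal sketch with illustrative values (a 45° field of view and z = -5, both chosen for the example rather than taken from the diagram). It computes the y on the frustum edge and then checks that the formula maps it to +1.

```javascript
// Pick a field of view and a camera-space depth (illustrative values).
const fovY = Math.PI / 4;          // 45 degrees
const f = 1 / Math.tan(fovY / 2);
const z = -5;                      // the camera looks down -z

// y on the top edge of the frustum at that depth: y = -z / f
const edgeY = -z / f;

// Dividing by that edge height maps camera-space y into clip space.
const clipY = (cameraY) => cameraY * f / -z;

console.log(edgeY);            // about 2.07 for this fovY
console.log(clipY(edgeY));     // top edge lands at +1 (up to rounding)
console.log(clipY(-edgeY));    // bottom edge lands at -1 (up to rounding)
console.log(clipY(0));         // centre of the view stays at 0
```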

Looking at makePerspective

function makePerspective(fieldOfViewInRadians, aspect, near, far) {
  var f = Math.tan(Math.PI * 0.5 - 0.5 * fieldOfViewInRadians);
  var rangeInv = 1.0 / (near - far);

  return [
    f / aspect, 0, 0, 0,
    0, f, 0, 0,
    0, 0, (near + far) * rangeInv, -1,
    0, 0, near * far * rangeInv * 2, 0
  ];
};

we can see that f in this case is

tan(Math.PI * 0.5 - 0.5 * fovY)

which is actually the same as

1 / tan(fovY / 2)

Why is it written this way? I'm guessing it is to avoid the division. With the 1 / tan(fovY / 2) style, if tan came out to 0 you'd divide by 0 (in JavaScript that gives Infinity rather than a crash, but in other languages it can be an error), whereas written this way there's no division, so no chance of a divide by zero.
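
The equivalence of the two forms is easy to check numerically. The sketch below uses two hypothetical helper names, `fByReciprocal` and `fByComplement`, introduced only for this comparison.

```javascript
// Compare the two ways of writing f for a few fields of view.
function fByReciprocal(fovY) {
  return 1 / Math.tan(fovY / 2);
}

function fByComplement(fovY) {
  return Math.tan(Math.PI * 0.5 - 0.5 * fovY);
}

for (const degrees of [30, 45, 60, 90, 120]) {
  const fovY = degrees * Math.PI / 180;
  // The two columns agree up to floating-point rounding.
  console.log(degrees, fByReciprocal(fovY), fByComplement(fovY));
}

// At fovY = 0 the reciprocal form divides by zero, which in JavaScript
// evaluates to Infinity instead of throwing.
console.log(fByReciprocal(0));   // Infinity
console.log(fByComplement(0));   // a very large finite number
```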

Seeing that -1 is in the matrix[11] spot means that when we're all done

matrix[5]  = tan(Math.PI * 0.5 - 0.5 * fovY)
matrix[11] = -1

clipY = cameraY * matrix[5] / (cameraZ * matrix[11])

For clipX we basically do the exact same calculation except scaled for the aspect ratio.

matrix[0]  = tan(Math.PI * 0.5 - 0.5 * fovY) / aspect
matrix[11] = -1

clipX = cameraX * matrix[0] / (cameraZ * matrix[11])
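
The role of the -1 is easier to see when the matrix is applied to a point. In the sketch below, `transformPoint` is a hypothetical helper written for this example, and the camera-space point (3, 2, -20) is arbitrary. The multiplication puts -cameraZ into w, and dividing by w (which the GPU does after the vertex shader) is the division by -z in the formulas above.

```javascript
// Multiply a column-major 4x4 matrix by the point (x, y, z, 1).
function transformPoint(m, [x, y, z]) {
  return [
    m[0] * x + m[4] * y + m[8] * z + m[12],
    m[1] * x + m[5] * y + m[9] * z + m[13],
    m[2] * x + m[6] * y + m[10] * z + m[14],
    m[3] * x + m[7] * y + m[11] * z + m[15],
  ];
}

// Same math as the makePerspective function shown earlier.
function makePerspective(fieldOfViewInRadians, aspect, near, far) {
  const f = Math.tan(Math.PI * 0.5 - 0.5 * fieldOfViewInRadians);
  const rangeInv = 1.0 / (near - far);
  return [
    f / aspect, 0, 0, 0,
    0, f, 0, 0,
    0, 0, (near + far) * rangeInv, -1,
    0, 0, near * far * rangeInv * 2, 0,
  ];
}

const matrix = makePerspective(Math.PI / 4, 4 / 3, 10, 100);
const [x, y, z, w] = transformPoint(matrix, [3, 2, -20]);

console.log(w);       // 20, i.e. -cameraZ, because matrix[11] is -1
console.log(x / w);   // cameraX * matrix[0] / -cameraZ
console.log(y / w);   // cameraY * matrix[5] / -cameraZ
```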

Finally we have to convert cameraZ in the -zNear <-> -zFar range to clipZ in the -1 <-> +1 range.

The standard perspective matrix does this with a reciprocal function so that z values close to the camera get more resolution than z values far from the camera. That formula is

clipZ = something / cameraZ + constant

Let's use s for something and c for constant.

clipZ = s / cameraZ + c;

and solve for s and c. In our case we know

s / -zNear + c = -1
s / -zFar  + c =  1

So, move the `c' to the other side

s / -zNear = -1 - c
s / -zFar  =  1 - c

Multiply by -zXXX

s = (-1 - c) * -zNear
s = ( 1 - c) * -zFar

Those 2 things now equal each other so

(-1 - c) * -zNear = (1 - c) * -zFar

expand the quantities

(-zNear * -1) - (c * -zNear) = (1 * -zFar) - (c * -zFar)

simplify

zNear + c * zNear = -zFar + c * zFar

move zNear to the right

c * zNear = -zFar + c * zFar - zNear

move c * zFar to the left

c * zNear - c * zFar = -zFar - zNear

simplify

c * (zNear - zFar) = -(zFar + zNear)

divide by (zNear - zFar)

c = -(zFar + zNear) / (zNear - zFar)

solve for s

s = (1 - -((zFar + zNear) / (zNear - zFar))) * -zFar

simplify

s = (1 + ((zFar + zNear) / (zNear - zFar))) * -zFar

change the 1 to (zNear - zFar) / (zNear - zFar)

s = ((zNear - zFar + zFar + zNear) / (zNear - zFar)) * -zFar

simplify

s = ((2 * zNear) / (zNear - zFar)) * -zFar

simplify some more

s = -(2 * zNear * zFar) / (zNear - zFar)
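
The result can be sanity-checked numerically. A small sketch, assuming zNear = 10 and zFar = 100 (the values from the question): it computes c from the formula above and s from s = (1 - c) * -zFar, then confirms that the near and far planes land on -1 and +1.

```javascript
const zNear = 10;
const zFar = 100;

// c from the derivation above, then s from s = (1 - c) * -zFar.
const c = -(zFar + zNear) / (zNear - zFar);
const s = (1 - c) * -zFar;

const clipZ = (cameraZ) => s / cameraZ + c;

console.log(c);               // 110 / 90, about 1.222
console.log(s);               // 2000 / 90, about 22.222
console.log(clipZ(-zNear));   // -1 (up to rounding)
console.log(clipZ(-zFar));    // +1 (up to rounding)
```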

dang I wish stackexchange supported math like their math site does :(

So back to the top. Our formula was

s / cameraZ + c

And we know s and c now.

clipZ = (2 * zNear * zFar) / (zNear - zFar) / -cameraZ -
        (zFar + zNear) / (zNear - zFar)

let's move the -z outside

clipZ = ((2 * zNear * zFar) / (zNear - zFar) +
         (zFar + zNear) / (zNear - zFar) * cameraZ) / -cameraZ

we can change / (zNear - zFar) to * 1 / (zNear - zFar) so

rangeInv = 1 / (zNear - zFar)
clipZ = ((2 * zNear * zFar) * rangeInv +
         (zFar + zNear) * rangeInv * cameraZ) / -cameraZ

Looking back at makePerspective we see it's going to end up making

clipZ = (matrix[10] * cameraZ + matrix[14]) / (cameraZ * matrix[11])

Comparing with the formula above, that fits:

rangeInv = 1 / (zNear - zFar)
matrix[10] = (zFar + zNear) * rangeInv
matrix[14] = 2 * zNear * zFar * rangeInv
matrix[11] = -1
clipZ = (matrix[10] * cameraZ + matrix[14]) / (cameraZ * matrix[11])
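
As a final check, the matrix form can be evaluated at a few depths. This is a sketch assuming zNear = 10 and zFar = 100 again; only the three entries that affect z and w are filled in.

```javascript
const zNear = 10;
const zFar = 100;
const rangeInv = 1 / (zNear - zFar);

// Start from zeros and set only the entries used by the clipZ formula.
const matrix = new Array(16).fill(0);
matrix[10] = (zFar + zNear) * rangeInv;
matrix[14] = 2 * zNear * zFar * rangeInv;
matrix[11] = -1;

const clipZ = (cameraZ) =>
  (matrix[10] * cameraZ + matrix[14]) / (cameraZ * matrix[11]);

for (const cameraZ of [-10, -20, -55, -100]) {
  console.log(cameraZ, clipZ(cameraZ).toFixed(3));
}
```

The near and far planes come out at -1 and +1. A point at -20 lands at roughly 0.111 and the midpoint of the range, -55, at roughly 0.818, which shows how most of the clipZ range is spent on depths close to the camera.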

I hope that made sense. Note: Most of this is just my re-writing of this article.
