How are vertices transformed in WebGL in indexed and non-indexed geometries?

Question

I am trying to digest these two links:

https://www.khronos.org/opengl/wiki/Rendering_Pipeline_Overview
https://www.khronos.org/opengl/wiki/Vertex_Shader

The pipeline overview says that the vertex shader runs before primitive assembly.

The second one mentions this:

A vertex shader is (usually) invariant with its input. That is, within a single Drawing Command, two vertex shader invocations that get the exact same input attributes will return binary identical results. Because of this, if OpenGL can detect that a vertex shader invocation is being given the same inputs as a previous invocation, it is allowed to reuse the results of the previous invocation, instead of wasting valuable time executing something that it already knows the answer to.

OpenGL implementations generally do not do this by actually comparing the input values (that would take far too long). Instead, this optimization typically only happens when using indexed rendering functions. If a particular index is specified more than once (within the same Instanced Rendering), then this vertex is guaranteed to result in the exact same input data.

Therefore, implementations employ a cache on the results of vertex shaders. If an index/instance pair comes up again, and the result is still in the cache, then the vertex shader is not executed again. Thus, there can be fewer vertex shader invocations than there are vertices specified.

So if I have two quads, each made of two triangles:

indexed:

verts: { 0 1 2 3 }
tris:  { 0 1 2 }
       { 1 2 3 }

soup:

verts: { 0 1 2 3 4 5 }
tris:  { 0 1 2 }
       { 3 4 5 }
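
To make the two layouts concrete, here is a minimal JavaScript sketch that expands the indexed quad into the equivalent soup. The helper `toSoup` and the unit-quad coordinates are hypothetical, chosen only for illustration; the index list mirrors the one in the question.

```javascript
// Expand an indexed mesh into a "soup" in which every triangle owns its vertices.
// `positions` is a flat [x, y, z, ...] array; `indices` holds 3 entries per triangle.
function toSoup(positions, indices) {
  const soup = new Float32Array(indices.length * 3);
  indices.forEach((index, i) => {
    soup[i * 3 + 0] = positions[index * 3 + 0];
    soup[i * 3 + 1] = positions[index * 3 + 1];
    soup[i * 3 + 2] = positions[index * 3 + 2];
  });
  return soup;
}

// Hypothetical unit quad in the XY plane (coordinates are an assumption).
const quadPositions = new Float32Array([
  0, 0, 0, // vertex 0
  1, 0, 0, // vertex 1
  0, 1, 0, // vertex 2
  1, 1, 0, // vertex 3
]);
const quadIndices = new Uint16Array([0, 1, 2, 1, 2, 3]);

const quadSoup = toSoup(quadPositions, quadIndices);
console.log(quadPositions.length / 3); // 4 vertices in the indexed layout
console.log(quadSoup.length / 3);      // 6 vertices in the soup layout
```

In WebGL the indexed layout would typically be drawn with gl.drawElements and the soup with gl.drawArrays; only the indexed path hands the GPU repeated indices that it could use to recognize a repeated vertex.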

And perhaps a vertex shader that looks like this:

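uniform mat4 mvm; // model-view matrix
uniform mat4 pm;  // projection matrix

attribute vec3 position;

void main() {
  vec4 res;
  // Deliberately wasteful loop that makes the vertex shader expensive.
  for (int i = 0; i < 256; i++) {
    res = pm * mvm * vec4(position, 1.);
  }
  gl_Position = res;
}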

Should I care that one has 4 vertices while the other has 6? Does this even hold from GPU to GPU: will one invoke the vertex shader 4 times and the other 6 times? How is this affected by the cache:

If an index/instance pair comes up again, and the result is still in the cache...

How is the number of primitives related to performance here? In both cases I have the same number of primitives.

In the case of a very simple fragment shader, but an expensive vertex shader:

void main(){
  gl_FragColor = vec4(1.);
}

And a tessellated quad (100x100 segments): can I say that the indexed version will run faster, or can run faster, or can I say nothing at all?
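
As a rough illustration of what is at stake for the 100x100 case, the sketch below only does the counting. The helper `gridCounts` is hypothetical, and the numbers describe how many vertices are submitted, not how many times any particular GPU actually runs the shader.

```javascript
// Count vertices and indices for a quad split into segments x segments cells,
// with two triangles per cell.
function gridCounts(segments) {
  const triangles = segments * segments * 2;
  return {
    triangles,
    indexedVertices: (segments + 1) * (segments + 1), // shared grid corners
    indexCount: triangles * 3,                        // entries in the index buffer
    soupVertices: triangles * 3,                      // every triangle owns 3 vertices
  };
}

const gridStats = gridCounts(100);
console.log(gridStats);
// { triangles: 20000, indexedVertices: 10201, indexCount: 60000, soupVertices: 60000 }
```

So the soup submits 60000 vertices, while the indexed version has 10201 unique vertices referenced 60000 times. With indexing, the number of vertex shader invocations can land anywhere between those two figures, depending on how much reuse the hardware manages.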

Recommended answer

As with everything on GPUs, going by the spec you can say nothing. It's up to the driver and the GPU. In reality, though, in your example 4 vertices will run faster than 6 pretty much everywhere.

Search for "vertex order optimization" and lots of articles come up:

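Linear-Speed Vertex Cache Optimisation

Triangle Order Optimization

AMD Triangle Order Optimization Tool

Triangle Order Optimization for Graphics Hardware Computation Culling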
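
To see why index order matters, here is a small sketch that models the vertex cache as a FIFO and counts how often the vertex shader would run. The FIFO policy, the cache size of 16, and the helper names `countInvocations` and `gridIndices` are all assumptions for illustration; real GPUs use their own, often undocumented, schemes.

```javascript
// Count vertex shader invocations under a simple FIFO post-transform cache.
// The FIFO policy and cacheSize are assumptions, not a model of any real GPU.
function countInvocations(indices, cacheSize) {
  const cache = []; // oldest entry first
  let invocations = 0;
  for (const index of indices) {
    if (cache.includes(index)) continue;         // hit: reuse the earlier result
    invocations++;                               // miss: the shader would run
    cache.push(index);
    if (cache.length > cacheSize) cache.shift(); // evict the oldest entry
  }
  return invocations;
}

// Row-major index list for a grid of segments x segments cells, two triangles per cell.
function gridIndices(segments) {
  const indices = [];
  const stride = segments + 1;
  for (let y = 0; y < segments; y++) {
    for (let x = 0; x < segments; x++) {
      const a = y * stride + x; // top-left corner of the cell
      const b = a + 1;          // top-right
      const c = a + stride;     // bottom-left
      const d = c + 1;          // bottom-right
      indices.push(a, b, c, b, d, c);
    }
  }
  return indices;
}

const gridIndexList = gridIndices(100);
const gridMisses = countInvocations(gridIndexList, 16);
// Invocations, and misses per triangle (the figure those articles try to minimize).
console.log(gridMisses, gridMisses / (gridIndexList.length / 3));
```

Reordering the triangles so that shared vertices are referenced close together lowers the miss count in this model, which is the kind of gain the articles above are after.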

Unrelated, but another example of spec vs. reality: according to the spec, depth testing happens AFTER the fragment shader runs (otherwise you couldn't set gl_FragDepth in the fragment shader). In reality, though, as long as the results are the same the driver/GPU can do whatever it wants, so fragment shaders that neither set gl_FragDepth nor discard fragments are depth tested first and only run if the test passes.
