How should you efficiently batch complex meshes?

Question

What is the best way to render complex meshes? I describe several solutions below and would like your opinion on them.

Let's take an example: how should the 'Crytek-Sponza' mesh be rendered?

PS: I do not use an ubershader, only separate shaders.

If you download the mesh from the following link:

http://graphics.cs.williams.edu/data/meshes.xml

and load it in Blender, you'll see that the whole mesh is composed of about 400 sub-meshes, each with its own material/textures.

A naive renderer (version 1) renders each of the 400 sub-meshes separately! That means (to simplify the situation) 400 draw calls, each with its own material/texture binding. Very bad for performance. Very slow!

pseudo-code version_1:

foreach mesh in meshList // 400 iterations :(
  mesh->BindVBO();

  Material material = mesh->GetMaterial();
  Shader bsdf = ShaderManager::GetBSDFByMaterial(material);

  bsdf->Bind();
  bsdf->SetMaterial(material);
  bsdf->SetTexture(material->GetTexture()); // bind texture

  mesh->Render();

Now, if we look at the materials being loaded, we notice that the Sponza actually uses ONLY (if I remember correctly :)) 25 different materials!

So a smarter solution (version 2) would be to gather all the vertex/index data into batches (25 in our example), storing the VBO/IBO not in the sub-mesh classes but in a new class called Batch.

pseudo-code version_2:

foreach batch in batchList // 25 iterations :)
  batch->BindVBO();

  Material material = batch->GetMaterial();
  Shader bsdf = ShaderManager::GetBSDFByMaterial(material);

  bsdf->Bind();
  bsdf->SetMaterial(material);
  bsdf->SetTexture(material->GetTexture()); // bind texture

  batch->Render();

In this case each VBO contains only data that shares exactly the same texture/material settings!
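The grouping step behind version 2 could be sketched on the CPU side roughly as follows. This is only an illustration under assumptions: `Vertex`, `SubMesh`, `Batch` and `BuildBatches` are hypothetical names, materials are identified by a plain integer id, and indices are 32-bit. The point it shows is that indices must be rebased when several sub-meshes are appended into one vertex array.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical CPU-side types; the names are illustrative only.
struct Vertex { float px, py, pz; float u, v; };

struct SubMesh {
    int materialId;                      // assumed: one integer id per material
    std::vector<Vertex> vertices;
    std::vector<std::uint32_t> indices;  // relative to this sub-mesh's vertices
};

struct Batch {
    int materialId;
    std::vector<Vertex> vertices;
    std::vector<std::uint32_t> indices;  // relative to this batch's vertices
};

// Merge all sub-meshes that share a material into one batch per material.
std::vector<Batch> BuildBatches(const std::vector<SubMesh>& subMeshes) {
    std::map<int, Batch> byMaterial;
    for (const SubMesh& sm : subMeshes) {
        Batch& b = byMaterial[sm.materialId];
        b.materialId = sm.materialId;
        // Vertices already in the batch shift the indices of the appended sub-mesh.
        const auto base = static_cast<std::uint32_t>(b.vertices.size());
        b.vertices.insert(b.vertices.end(), sm.vertices.begin(), sm.vertices.end());
        for (std::uint32_t i : sm.indices) {
            assert(i < sm.vertices.size());
            b.indices.push_back(base + i);
        }
    }
    std::vector<Batch> out;
    for (auto& kv : byMaterial) out.push_back(std::move(kv.second));
    return out;  // ordered by material id because std::map is ordered
}
```

Each resulting batch would then be uploaded once and drawn with a single draw call, which is where the "25 iterations" in the pseudo-code comes from.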

That's so much better! But now I think 25 VBOs to render the Sponza is still too many! The problem is the number of buffer bindings needed to render the Sponza! I think a good solution would be to allocate a new VBO only when the current one is 'full' (for example, let's assume that the maximum size of a VBO, defined as an attribute of the VBO class, is 4MB or 8MB).

pseudo-code version_3:

foreach vbo in vboList // for example 5 VBOs (depends on maxVBOSize)
  vbo->Bind();

  BatchList batchList = vbo->GetBatchList();

  foreach batch in batchList
    Material material = batch->GetMaterial();
    Shader bsdf = ShaderManager::GetBSDFByMaterial(material);

    bsdf->Bind();
    bsdf->SetMaterial(material);
    bsdf->SetTexture(material->GetTexture()); // bind texture

    batch->Render();

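In this case a VBO does not necessarily contain only data that shares exactly the same texture/material settings! It depends on the order in which the sub-meshes are loaded!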

So OK, there are fewer VBO/IBO bindings but not necessarily fewer draw calls! (Do you agree with this statement?) But in general I think version 3 is better than the previous one! What do you think about this?

Another optimization would be to store all the textures (or groups of textures) of the Sponza model in texture array(s)! But if you download the Sponza package you will see that the textures have different sizes! So I think they can't be bound together because of their format differences.

But if it is possible, version 4 of the renderer should need far fewer texture bindings than the 25 currently used for the whole mesh! Do you think it's possible?
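Since every layer of an array texture has to share the same dimensions and format, one way to approach the "group of textures" idea is to bucket the textures by (width, height, format) and give each bucket its own array. A rough sketch, assuming hypothetical `TextureInfo` / `ArraySlot` types and an integer standing in for the pixel format; no GL calls are made here, it only decides which array and layer each texture would land in:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <tuple>
#include <vector>

// Hypothetical description of a loaded texture; format is an opaque id here.
struct TextureInfo { int width; int height; int format; };

// Where a texture would live: which array texture, and which layer inside it.
struct ArraySlot { int arrayIndex; int layer; };

// Textures with identical (width, height, format) share one array texture.
std::vector<ArraySlot> AssignArraySlots(const std::vector<TextureInfo>& textures) {
    std::map<std::tuple<int, int, int>, int> arrayOfKey;  // key -> array index
    std::vector<int> layersUsed;                          // per array: layers so far
    std::vector<ArraySlot> slots;
    for (const TextureInfo& t : textures) {
        assert(t.width > 0 && t.height > 0);
        const auto key = std::make_tuple(t.width, t.height, t.format);
        auto it = arrayOfKey.find(key);
        if (it == arrayOfKey.end()) {
            // First texture of this size/format: open a new array.
            it = arrayOfKey.emplace(key, static_cast<int>(layersUsed.size())).first;
            layersUsed.push_back(0);
        }
        int& used = layersUsed[static_cast<std::size_t>(it->second)];
        slots.push_back({it->second, used});
        ++used;
    }
    return slots;
}
```

The number of texture binds then drops to the number of distinct size/format buckets rather than the number of materials; how many buckets the Sponza textures actually fall into is not something this sketch can tell you.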

So, in your opinion, what is the best way to render the Sponza mesh? Do you have any other suggestions?

Recommended answer

You are focused on the wrong things, in two ways.

First, there's no reason you can't stick all of the mesh's vertex data into a single buffer object. Note that this has nothing to do with batching. Remember: batching is about the number of draw calls, not the number of buffers you use. You can render 400 draw calls out of the same buffer.

This "maximum size" that you seem to want is a fiction, based on nothing from the real world. You can have it if you want. Just don't expect it to make your code faster.

So when rendering this mesh, there is no reason to be switching buffers at all.
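To make the "one buffer, many draw calls" point concrete, here is a minimal sketch of the bookkeeping involved. It assumes 32-bit indices and triangle lists, and `MeshSizes`, `DrawRange` and `PackIntoSharedBuffers` are hypothetical names. It only computes where each sub-mesh would sit inside one shared vertex buffer and one shared index buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-sub-mesh sizes, as known after loading the model.
struct MeshSizes { std::uint32_t vertexCount; std::uint32_t indexCount; };

// Parameters for one draw call out of the shared buffers.
struct DrawRange {
    std::uint32_t indexCount;       // number of indices to draw
    std::size_t   indexByteOffset;  // byte offset into the shared index buffer
    std::int32_t  baseVertex;       // added to every index fetched by the draw
};

// Lay all sub-meshes out back to back in one vertex buffer and one index buffer.
std::vector<DrawRange> PackIntoSharedBuffers(const std::vector<MeshSizes>& meshes) {
    std::vector<DrawRange> ranges;
    ranges.reserve(meshes.size());
    std::size_t indexCursor = 0;    // in indices, not bytes
    std::int32_t vertexCursor = 0;
    for (const MeshSizes& m : meshes) {
        assert(m.indexCount % 3 == 0);  // assumption: triangle lists
        ranges.push_back({m.indexCount,
                          indexCursor * sizeof(std::uint32_t),
                          vertexCursor});
        indexCursor += m.indexCount;
        vertexCursor += static_cast<std::int32_t>(m.vertexCount);
    }
    return ranges;
}

// With the shared VAO bound once, each range r could then be drawn with e.g.
//   glDrawElementsBaseVertex(GL_TRIANGLES, r.indexCount, GL_UNSIGNED_INT,
//                            (void*)r.indexByteOffset, r.baseVertex);
```

Because the base vertex is supplied per draw, the original sub-mesh indices can be copied into the shared index buffer unchanged.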

Second, batching is not really about the number of draw calls (in OpenGL). It's really about the cost of the state changes between draw calls.

This video clearly spells out (about 31 minutes in) the relative cost of different state changes. Issuing two draw calls with no state changes between them is cheap (relatively speaking). But different kinds of state changes have different costs.

The cost of changing buffer bindings is quite small (assuming you're using separate vertex formats, so that changing buffers doesn't mean changing vertex formats). The cost of changing programs and even texture bindings is far greater. So even if you had to make multiple buffer objects (which, again, you don't have to), that's not going to be the primary bottleneck.

So if performance is your goal, you'd be better off focusing on the expensive state changes, not the cheap ones. Make a single shader that can handle all of the material settings for the entire mesh, so that you only need to change uniforms between draw calls. Use array textures so that you only have one texture binding call. This will turn a texture bind into a uniform setting, which is a much cheaper state change.
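If you do end up with several programs or textures, the same reasoning suggests ordering the draws so that the expensive state changes least often. A rough sketch, assuming a hypothetical `DrawItem` record with non-negative integer ids and taking the cost ranking (program, then texture, then buffer) from the paragraphs above rather than from any measurement:

```cpp
#include <algorithm>
#include <cassert>
#include <tuple>
#include <vector>

// Hypothetical record of the state one draw call needs.
struct DrawItem {
    int programId;  // assumed most expensive to change
    int textureId;  // assumed expensive to change
    int bufferId;   // assumed cheap to change
};

struct StateChanges { int programs; int textures; int buffers; };

// Order draws so that the most expensive state is the slowest-varying sort key.
void SortByStateCost(std::vector<DrawItem>& items) {
    std::sort(items.begin(), items.end(), [](const DrawItem& a, const DrawItem& b) {
        return std::tie(a.programId, a.textureId, a.bufferId) <
               std::tie(b.programId, b.textureId, b.bufferId);
    });
}

// Count how many binds of each kind a renderer would issue for this order,
// if it only rebinds when the required state differs from the current one.
StateChanges CountStateChanges(const std::vector<DrawItem>& items) {
    StateChanges c{0, 0, 0};
    int program = -1, texture = -1, buffer = -1;  // -1 means "nothing bound yet"
    for (const DrawItem& d : items) {
        assert(d.programId >= 0 && d.textureId >= 0 && d.bufferId >= 0);
        if (d.programId != program) { program = d.programId; ++c.programs; }
        if (d.textureId != texture) { texture = d.textureId; ++c.textures; }
        if (d.bufferId  != buffer)  { buffer  = d.bufferId;  ++c.buffers;  }
    }
    return c;
}
```

With a single shader and a single array texture, as suggested above, both expensive counters would stay at one for the whole mesh and the sort becomes unnecessary.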

There are even fancier things you can do, involving base instance counts and the like. But that's overkill for a trivial example like this.
