OpenGL state redundancy elimination tree, render state priorities

Question

I am working on an automatic OpenGL batching method in my game engine, to reduce draw calls and redundant state calls.

My batch tree design begins with the most expensive states and adds leaves below them for each less expensive state.

Example: Tree root: shaders / programs. Siblings: blend states, and so on.
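The idea of ordering by the most expensive state first can be sketched as a sort over draw commands. This is only an illustration: the `BatchCommand` shape, the numeric state ids, and the three-entry `STATE_PRIORITY` order are assumptions, not part of the engine described here.

```typescript
// Hypothetical draw command: each field is an opaque id for one piece of state.
interface BatchCommand {
  program: number;
  texture: number;
  blend: number;
}

type StateKey = "program" | "texture" | "blend";

// State fields ordered from (assumed) most to least expensive to change.
const STATE_PRIORITY: ReadonlyArray<StateKey> = ["program", "texture", "blend"];

// Sort a copy so that commands sharing expensive state end up adjacent.
function sortByState(commands: BatchCommand[]): BatchCommand[] {
  return [...commands].sort((a, b) => {
    for (const key of STATE_PRIORITY) {
      if (a[key] !== b[key]) return a[key] - b[key];
    }
    return 0;
  });
}

// Count how often each state differs from the previous command when the
// commands are issued in order (the first command always counts as a change).
function countChanges(commands: BatchCommand[]): Record<StateKey, number> {
  const changes = { program: 0, texture: 0, blend: 0 };
  let prev: BatchCommand | undefined;
  for (const cmd of commands) {
    for (const key of STATE_PRIORITY) {
      if (prev === undefined || prev[key] !== cmd[key]) changes[key]++;
    }
    prev = cmd;
  }
  return changes;
}
```

Comparing `countChanges(commands)` with `countChanges(sortByState(commands))` gives a rough feel for how many program binds a given ordering saves, at the price of possibly more changes in the cheaper states.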

So my question is: which calls in this list are most likely the most expensive?

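  • Binding a program
  • Binding textures
  • Binding buffers
  • Buffering texture and vertex data
  • Binding render targets
  • glEnable/glDisable
  • Blend state: equation, color, functions, colorWriteMask
  • Depth/stencil state: depthFunc, stencilOperations, stencilFunction, writeMasks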

I am also wondering which method will be faster:
- Collect all batchable draw commands into a single vertex buffer and issue only one draw call (this method would also force matrix transforms to be applied per vertex on the CPU side)
- Do not batch at all and render many small draw calls, batching only particle systems ...
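To make the first option concrete, here is a minimal sketch of CPU-side batching. The `Sprite` shape, the 2D affine transform layout, and the name `buildBatch` are assumptions chosen for illustration; a real engine would likely use 4x4 matrices and interleaved attributes.

```typescript
// Hypothetical sprite: local 2D vertices stored as x, y pairs, plus an affine
// transform [a, b, c, d, tx, ty] meaning
//   x' = a * x + c * y + tx
//   y' = b * x + d * y + ty
interface Sprite {
  vertices: number[];
  transform: [number, number, number, number, number, number];
}

// Pre-transform every vertex on the CPU and pack all sprites into one array,
// so the whole batch can be uploaded once and drawn with a single call.
function buildBatch(sprites: Sprite[]): Float32Array {
  const total = sprites.reduce((n, s) => n + s.vertices.length, 0);
  const out = new Float32Array(total);
  let offset = 0;
  for (const { vertices, transform: [a, b, c, d, tx, ty] } of sprites) {
    for (let i = 0; i < vertices.length; i += 2) {
      const x = vertices[i];
      const y = vertices[i + 1];
      out[offset++] = a * x + c * y + tx;
      out[offset++] = b * x + d * y + ty;
    }
  }
  return out;
}
```

In WebGL the result would typically be uploaded with `gl.bufferData` and drawn with one `gl.drawArrays` call. Whether this beats many small draw calls depends on vertex counts and on how costly the per-vertex CPU work is, so it is worth measuring both.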

PS: Render targets will always be changed pre or post, depending on usage.

Progress so far:

  • Andon M. Coleman: uniforms and vertex array bindings are cheapest; FBO and texture bindings are expensive
  • datenwolf: programs invalidate the state cache

1: Framebuffer states
2: Program
3: Texture binding
...
N: Vertex array binding, uniform binding

Current execution tree in WebGL:

  • Program
  • Attribute Pointers
  • Texture
  • Blend State
  • Depth State
  • Stencil Front / Back State
  • Rasterizer State
  • Sampler State
  • Bind Buffer
  • Draw Arrays

Each step is a sibling hash tree, to avoid checking against the state cache inside the main render queue.
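One way to read "sibling hash tree" is a tree of nested hash maps, one level per state, where siblings are the distinct values of that state. The sketch below assumes string state ids and uses the hypothetical names `StateNode`, `insertDraw`, and `flushTree`; it is not the asker's actual implementation.

```typescript
// One node per distinct state value; children are keyed by the id of the
// next (cheaper) state in the chain.
class StateNode {
  readonly children = new Map<string, StateNode>();
  readonly draws: string[] = [];
}

// Insert a draw under a path of state ids, e.g. [programId, textureId].
function insertDraw(root: StateNode, path: string[], draw: string): void {
  let node = root;
  for (const key of path) {
    let child = node.children.get(key);
    if (child === undefined) {
      child = new StateNode();
      node.children.set(key, child);
    }
    node = child;
  }
  node.draws.push(draw);
}

// Depth-first walk: `apply` runs once per node (one state change), and
// `emit` runs once per draw stored at that node.
function flushTree(
  node: StateNode,
  apply: (level: number, key: string) => void,
  emit: (draw: string) => void,
  level = 0,
): void {
  for (const draw of node.draws) emit(draw);
  for (const [key, child] of node.children) {
    apply(level, key);
    flushTree(child, apply, emit, level + 1);
  }
}
```

Because draws with identical state land in the same node, the walk applies each state once per subtree and never needs to compare against a separate state cache while rendering.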

Loading textures / programs / shaders / buffers happens before rendering, in an extra queue, to allow for future multithreading and to make sure that the context is initialized before anything is done with it.

The biggest problem with self-rendering objects is that you cannot control when something happens. For example, if a developer calls these methods before GL is initialized, they would run into bugs or problems without knowing why.
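A separate loading queue avoids that ordering problem because tasks simply wait until the context exists. The following is a small sketch under that assumption; `DeferredQueue` and its method names are hypothetical, and in a real engine `Ctx` would be something like `WebGLRenderingContext`.

```typescript
// Hypothetical queue that holds resource-loading tasks until a context exists.
class DeferredQueue<Ctx> {
  private pending: Array<(ctx: Ctx) => void> = [];
  private ctx: Ctx | undefined;

  // Run the task immediately if the context is ready, otherwise store it.
  enqueue(task: (ctx: Ctx) => void): void {
    if (this.ctx !== undefined) {
      task(this.ctx);
    } else {
      this.pending.push(task);
    }
  }

  // Call once the context has been created; drains stored tasks in
  // submission order.
  initialize(ctx: Ctx): void {
    this.ctx = ctx;
    const tasks = this.pending;
    this.pending = [];
    for (const task of tasks) task(ctx);
  }
}
```

With this shape, a caller can request a texture or buffer upload at any time; nothing touches the context until `initialize` has run.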

Recommended answer

The relative costs of such operations will of course depend on the usage pattern and your general scenario. But you might find Nvidia's "Beyond Porting" presentation slides a useful guide. Let me reproduce slide 48 in particular here:

Relative cost of state changes

  • In decreasing cost...
  • Render Target ~60K/s
  • Program ~300K/s
  • ROP
  • Texture Bindings ~1.5M/s
  • Vertex Format
  • UBO Bindings
  • Uniform Updates ~10M/s
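Read as throughput, these figures allow a rough back-of-the-envelope estimate of the time a frame spends on state changes. The sketch below only does that arithmetic; the function name and the `counts` shape are assumptions, the entries without a figure on the slide (ROP, vertex format, UBO bindings) are left out, and real costs will vary with hardware and driver.

```typescript
// Approximate state changes per second, copied from the slide quoted above.
const CHANGES_PER_SECOND = {
  renderTarget: 60000,
  program: 300000,
  textureBinding: 1500000,
  uniformUpdate: 10000000,
} as const;

type CostedState = keyof typeof CHANGES_PER_SECOND;

// Estimated milliseconds spent on the given number of changes per state,
// assuming each change costs 1 / throughput seconds.
function estimateStateCostMs(counts: Partial<Record<CostedState, number>>): number {
  let seconds = 0;
  for (const state of Object.keys(CHANGES_PER_SECOND) as CostedState[]) {
    seconds += (counts[state] ?? 0) / CHANGES_PER_SECOND[state];
  }
  return seconds * 1000;
}
```

For example, under these numbers 6 render target changes and 30 program changes each account for about 0.1 ms, which shows why sorting by render target and program first tends to pay off most.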

This does not directly match all of the bullet points in your list. For example, glEnable/glDisable might affect anything. Also, GL's buffer bindings are nothing the GPU directly sees; buffer bindings are mainly client-side state, depending on the target, of course. A change of blending state would be a ROP state change, and so on.
