What does the Tiler Utilization statistic mean in the iPhone OpenGL ES instrument?

Question

I have been trying to perform some OpenGL ES performance optimizations in an attempt to boost the number of triangles per second that I'm able to render in my iPhone application, but I've hit a brick wall. I've tried converting my OpenGL ES data types from fixed to floating point (per Apple's recommendation), interleaving my vertex buffer objects, and minimizing changes in drawing state, but none of these changes has made a difference in rendering speed. No matter what, I can't seem to push my application above 320,000 triangles/s on an iPhone 3G running the 3.0 OS. According to this benchmark, I should be able to hit 687,000 triangles/s on this hardware with the smooth shading I'm using.

In my testing, when I run the OpenGL ES performance tool in Instruments against the running device, I'm seeing the statistic "Tiler Utilization" reach nearly 100% when rendering my benchmark, yet the "Renderer Utilization" only gets to about 30%. This may be a clue as to what the bottleneck is in the display process, but I don't know what these values mean, and I've not found any documentation on them. Does someone have a good description of what this and the other statistics in the iPhone OpenGL ES instrument stand for? I know that the PowerVR MBX Lite in the iPhone 3G is a tile-based deferred renderer, but I'm not sure what the difference would be between the Renderer and Tiler in that architecture.

If it helps in any way, the (BSD-licensed) source code to this application is available if you want to download and test it yourself. In the current configuration, it starts a little benchmark every time you load a new molecular structure and outputs the triangles/s to the console.

Recommended answer

The Tiler Utilization and Renderer Utilization percentages measure the duty cycle of the vertex and fragment processing hardware, respectively. On the MBX, Tiler Utilization typically scales with the amount of vertex data being sent to the GPU (in terms of both the number of vertices and the size of the attributes sent per vertex), and Renderer Utilization generally increases with overdraw and texture sampling.
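To make the "amount of vertex data" concrete, here is a rough sketch in C that adds up the bytes submitted per vertex. The layout it assumes (float positions and normals plus a 4-byte color) is a guess for illustration, not something taken from the application's source, and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes occupied by one attribute: component count times component size. */
static size_t attribute_bytes(size_t components, size_t bytes_per_component)
{
    assert(components >= 1 && components <= 4);
    return components * bytes_per_component;
}

/* Assumed (hypothetical) layout: float position (3 components),
   float normal (3 components), unsigned byte color (4 components). */
static size_t float_vertex_bytes(void)
{
    return attribute_bytes(3, sizeof(float))
         + attribute_bytes(3, sizeof(float))
         + attribute_bytes(4, sizeof(unsigned char));
}

/* Vertex data submitted per second for a given vertex rate. */
static size_t vertex_bytes_per_second(size_t vertices_per_second,
                                      size_t bytes_per_vertex)
{
    return vertices_per_second * bytes_per_vertex;
}
```

Under that assumed layout each vertex costs 28 bytes, so anything that shrinks the per-vertex size reduces the data the tiler has to process at the same vertex rate.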

In your case, the best thing would be to reduce the size of each vertex you’re sending. For starters, I’d try binning your atoms and bonds by color, and sending each of these bins using a constant color instead of a color array. I’d also suggest investigating whether shorts are suitable for your positions and normals, given appropriate scaling. You might also have to bin by position in this case, if shorts scaled to provide sufficient precision don’t cover the range you need. These sorts of techniques might require additional draw calls, but I suspect the improvement in vertex throughput will outweigh the extra per-draw-call CPU overhead.
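As an illustration of the "shorts with appropriate scaling" idea, the following C sketch maps a float coordinate in a known range onto a signed 16-bit value and computes the factor needed to undo that mapping. The helper names and the symmetric range parameter are assumptions made for this example; the actual range depends on the size of the molecule being drawn.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: map a float in [-range, +range] to a signed
   16-bit value. Out-of-range inputs are clamped. */
static int16_t quantize_to_short(float value, float range)
{
    assert(range > 0.0f);
    float scaled = value / range * 32767.0f;
    if (scaled > 32767.0f) scaled = 32767.0f;
    if (scaled < -32767.0f) scaled = -32767.0f;
    /* Round to nearest rather than truncating toward zero. */
    return (int16_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}

/* Factor that converts quantized positions back to model units.
   With GL_SHORT positions this would typically be folded into the
   modelview matrix, e.g. glScalef(s, s, s) before drawing. */
static float dequantize_scale(float range)
{
    return range / 32767.0f;
}
```

With a range of 10 units the step size is about 0.0003 units. If that is too coarse for a large structure, that is the situation where binning by position, with a separate offset and range per bin, comes in. For the per-bin constant color, one common OpenGL ES 1.1 approach is to leave GL_COLOR_ARRAY disabled and call glColor4f before each bin's draw call.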

Note that it’s generally beneficial (on MBX and elsewhere) to ensure that each vertex attribute begins on a 32-bit boundary, which implies that you should pad your positions and normals out to 4 components if you switch them to shorts. The peculiarities of the MBX platform also mean that, in this case, you want to actually include the W component of the position in the call to glVertexPointer.
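One way to picture the padded layout is an interleaved struct like the C sketch below. The struct and helper names are hypothetical, and the GL calls appear only in a comment because the exact buffer setup depends on the application.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Interleaved vertex with both attributes padded to 4 shorts (8 bytes),
   so each attribute starts on a 32-bit boundary. */
typedef struct {
    int16_t position[4]; /* x, y, z, w */
    int16_t normal[4];   /* nx, ny, nz, unused padding */
} PackedVertex;

static int starts_on_32bit_boundary(size_t byte_offset)
{
    return byte_offset % 4 == 0;
}

/* Hypothetical helper: build one padded vertex from quantized data. */
static PackedVertex make_packed_vertex(const int16_t xyz[3],
                                       const int16_t nxyz[3])
{
    PackedVertex v;
    for (int i = 0; i < 3; i++) {
        v.position[i] = xyz[i];
        v.normal[i] = nxyz[i];
    }
    v.position[3] = 1; /* W is sent explicitly, as a 4-component position */
    v.normal[3] = 0;   /* padding; normals are always read as 3 components */
    assert(starts_on_32bit_boundary(offsetof(PackedVertex, normal)));
    return v;
}

/* Assumed usage, where `base` is the array pointer or VBO offset:
     glVertexPointer(4, GL_SHORT, sizeof(PackedVertex),
                     base + offsetof(PackedVertex, position));
     glNormalPointer(GL_SHORT, sizeof(PackedVertex),
                     base + offsetof(PackedVertex, normal));        */
```

Each vertex in this layout is 16 bytes. The fourth normal component is never read by glNormalPointer; it exists only so that the next vertex's position also starts on a 32-bit boundary.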

You might also consider pursuing alternate lighting methods like DOT3 for your polygon data, particularly the spheres, but this requires special care to make sure that you aren’t making your rendering fragment-bound, or inadvertently sending more vertex data than before.
