What does the Tiler Utilization statistic mean in the iPhone OpenGL ES instrument?


Question

I have been trying to optimize OpenGL ES performance to boost the number of triangles per second that I can render in my iPhone application, but I've hit a brick wall. I've tried converting my OpenGL ES data types from fixed to floating point (per Apple's recommendation), interleaving my vertex buffer objects, and minimizing changes in drawing state, but none of these changes has made a difference in rendering speed. No matter what, I can't seem to push my application above 320,000 triangles/s on an iPhone 3G running the 3.0 OS. According to this benchmark, I should be able to hit 687,000 triangles/s on this hardware with the smooth shading I'm using.

In my testing, when I run the OpenGL ES performance tool in Instruments against the running device, I'm seeing the statistic "Tiler Utilization" reaching nearly 100% when rendering my benchmark, yet the "Renderer Utilization" is only getting to about 30%. This may be providing a clue as to what the bottleneck is in the display process, but I don't know what these values mean, and I've not found any documentation on them. Does someone have a good description of what this and the other statistics in the iPhone OpenGL ES instrument stand for? I know that the PowerVR MBX Lite in the iPhone 3G is a tile-based deferred renderer, but I'm not sure what the difference would be between the Renderer and Tiler in that architecture.

If it helps in any way, the (BSD-licensed) source code to this application is available if you want to download and test it yourself. In the current configuration, it starts a little benchmark every time you load a new molecular structure and prints the measured triangles/s to the console.

Recommended answer

The Tiler Utilization and Renderer Utilization percentages measure the duty cycle of the vertex and fragment processing hardware, respectively. On the MBX, Tiler Utilization typically scales with the amount of vertex data being sent to the GPU (in terms of both the number of vertices and the size of the attributes sent per-vertex), and Renderer Utilization generally increases with overdraw and texture sampling.
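To make that scaling concrete, here is a rough back-of-the-envelope sketch in C. The function name and the example sizes in the comments are assumptions for illustration, not measurements from the application in question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes of vertex data submitted to the GPU per second.
 * This is the quantity that tiler load roughly tracks on the MBX, as
 * described above: it grows with both vertex count and per-vertex size. */
static size_t vertex_bytes_per_second(size_t bytes_per_vertex,
                                      size_t vertices_per_frame,
                                      size_t frames_per_second)
{
    assert(bytes_per_vertex > 0);
    return bytes_per_vertex * vertices_per_frame * frames_per_second;
}

/* Assumed example layouts:
 *   float position (12) + float normal (12) + RGBA ubyte color (4) = 28 bytes
 *   short position padded to 4 (8) + short normal padded to 4 (8)  = 16 bytes
 */
```

With the same vertex count and frame rate, the assumed 16-byte layout submits 16/28 of the data of the 28-byte one, which is the kind of reduction that matters when the tiler is the stage that is saturated.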

In your case, the best thing would be to reduce the size of each vertex you’re sending. For starters, I’d try binning your atoms and bonds by color, and sending each of these bins using a constant color instead of a color array. I’d also suggest investigating whether shorts are suitable for your positions and normals, given appropriate scaling. You might also have to bin by position in this case, if shorts scaled to provide sufficient precision don’t cover the range you need. These sorts of techniques might require additional draw calls, but I suspect the improvement in vertex throughput will outweigh the extra per-draw-call CPU overhead.
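As a minimal sketch of the "shorts with appropriate scaling" idea, the C below maps a float coordinate onto a signed 16-bit value. The helper names `quantize_coord` and `dequantize_scale`, and the choice of a symmetric range `[-extent, +extent]`, are assumptions made for illustration; they are not taken from the application's source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: map a coordinate in [-extent, +extent] onto a
 * signed 16-bit short. Values outside the range are clamped. */
static int16_t quantize_coord(float value, float extent)
{
    assert(extent > 0.0f);
    float scaled = value / extent * 32767.0f;
    if (scaled > 32767.0f)  scaled = 32767.0f;
    if (scaled < -32767.0f) scaled = -32767.0f;
    /* Round to nearest rather than truncating toward zero. */
    return (int16_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}

/* Factor that undoes the quantization. One way to apply it is to fold it
 * into the modelview matrix, e.g. glScalef(s, s, s), so the GPU sees the
 * original coordinate range. */
static float dequantize_scale(float extent)
{
    return extent / 32767.0f;
}
```

The precision you get is roughly `extent / 32767` per step, so if one extent large enough for the whole model is too coarse, that is the point where binning by position (a smaller extent per bin, plus a per-bin translation) comes in.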

Note that it’s generally beneficial (on MBX and elsewhere) to ensure that each vertex attribute begins on a 32-bit boundary, which implies that you should pad your positions and normals out to 4 components if you switch them to shorts. The peculiarities of the MBX platform also make it such that you want to actually include the W component of the position in the call to glVertexPointer in this case.
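One possible interleaved layout that follows this advice is sketched below. The struct and function names are hypothetical, and the GL calls are shown only in a comment (assuming OpenGL ES 1.1 and a bound vertex buffer object) so the snippet compiles without the GL headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical interleaved vertex: both attributes are padded to four
 * shorts (8 bytes) so each one starts on a 32-bit boundary. */
typedef struct {
    int16_t position[4]; /* x, y, z, w (w is passed to GL explicitly) */
    int16_t normal[4];   /* nx, ny, nz, unused padding */
} PackedVertex;

/* Build one vertex from already-quantized shorts; w is set to 1. */
static PackedVertex make_vertex(int16_t x, int16_t y, int16_t z,
                                int16_t nx, int16_t ny, int16_t nz)
{
    PackedVertex v = { { x, y, z, 1 }, { nx, ny, nz, 0 } };
    return v;
}

/* Sanity-check the layout assumptions at startup. */
static void check_packed_vertex_layout(void)
{
    assert(offsetof(PackedVertex, position) % 4 == 0);
    assert(offsetof(PackedVertex, normal) % 4 == 0);
    assert(sizeof(PackedVertex) % 4 == 0);
}

/* Corresponding attribute setup, with a VBO bound:
 *
 *   glVertexPointer(4, GL_SHORT, sizeof(PackedVertex),
 *                   (const GLvoid *)offsetof(PackedVertex, position));
 *   glNormalPointer(GL_SHORT, sizeof(PackedVertex),
 *                   (const GLvoid *)offsetof(PackedVertex, normal));
 *
 * Note the size of 4 in glVertexPointer, so W is included. glNormalPointer
 * always reads three components; the fourth short is skipped via the stride.
 */
```

This comes to 16 bytes per vertex, with the padding short in the normal never read by GL.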

You might also consider pursuing alternate lighting methods like DOT3 for your polygon data, particularly the spheres, but this requires special care to make sure that you aren’t making your rendering fragment-bound, or inadvertently sending more vertex data than before.
