OpenGL core profile incredible slowdown on OS X


Question

I added a new GL renderer to my engine, which uses the core profile. While it runs fine on Windows and/or nvidia cards, it is about 10 times slower on OS X (3 fps instead of 30). The weird thing is that my compatibility profile renderer runs fine.

I collected some traces with Instruments and the GL profiler:

https://www.dropbox.com/sh/311fg9wu0zrarzm/31CGvUcf2q

They show that the application spends its time in glDrawRangeElements. I tried the following things:

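  • change to glDrawElements (no effect)
  • flip culling (no change in speed)
  • disable some GL_DYNAMIC_DRAW buffers (no effect)
  • bind the index buffer after the VAO when drawing (no effect)
  • convert indices to 4 bytes (no effect)
  • use GL_BGRA textures (no effect)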

What I didn't try is aligning my vertices to a 16-byte boundary and/or converting indices to 4 bytes, but seriously, if that were the issue, then why the hell would the standard allow it?

I'm creating the context like this:

NSOpenGLPixelFormatAttribute attributes[] =
{
    NSOpenGLPFAColorSize, 24,
    NSOpenGLPFAAlphaSize, 8,
    NSOpenGLPFADepthSize, 24,
    NSOpenGLPFAStencilSize, 8,
    NSOpenGLPFADoubleBuffer,
    NSOpenGLPFAAccelerated,
    NSOpenGLPFANoRecovery,
    NSOpenGLPFAOpenGLProfile, NSOpenGLProfileVersion3_2Core,
    0
};

NSOpenGLPixelFormat* format = [[NSOpenGLPixelFormat alloc] initWithAttributes:attributes];
NSOpenGLContext* context = [[NSOpenGLContext alloc] initWithFormat:format shareContext:nil];

[self.view setOpenGLContext:context];
[context makeCurrentContext];

Tried on the following configurations:

  • radeon 6630M, OS X 10.7.5
  • radeon 6750M, OS X 10.7.5
  • geforce GT 330M, OS X 10.8.3

Do you have any idea what I might be doing wrong? Again, it works fine with the compatibility profile (not using VAOs, though).

UPDATE: reported to Apple.

UPDATE: Apple doesn't give a damn about the problem... Anyway, I created a small test program which actually runs fine. I then compared the call stacks in Instruments and found that, when using the engine, glDrawRangeElements makes two calls:

  • gleDrawArraysOrElements_ExecCore
  • gleDrawArraysOrElements_Entries_Body

while in the test program it calls only the second. The first call does something like an immediate mode render (gleFlushPrimitivesTCLFunc, gleRunVertexSubmitterImmediate), so it is obviously what causes the slowdown.

Recommended answer

Finally, I was able to reproduce the slowdown. This is just crazy... It is clearly caused by glBindAttribLocation being called on the "my_Position" attribute. I then did some testing:

  • 1 is the default (as returned by glGetAttribLocation)
  • if I set it to zero, then it's fine
  • if I set it to 1, the rendering becomes slow
  • if I set it to a larger number, it is slow again

Obviously I relink the program after changing the binding (check the code). It is not a problem in my implementation; I tested it with "normal" values too.
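Based only on the observation above (location 0 for "my_Position" is fast, any other location is slow), one way to keep the position attribute in slot 0 is to plan the bindings up front. The following is a minimal sketch, not the engine's actual code: AttribBinding, MAX_ATTRIBS and plan_attrib_locations are hypothetical names, and the OpenGL calls appear only in comments so the snippet compiles without a GL context.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical upper bound for this sketch; a real renderer would query
   GL_MAX_VERTEX_ATTRIBS instead. */
#define MAX_ATTRIBS 16

/* Hypothetical record: one attribute name and the location planned for it. */
typedef struct {
    const char *name;
    unsigned    location;
} AttribBinding;

/* Plans one location per attribute name. The first attribute whose name
   equals position_name gets location 0; all others get 1, 2, ... in input
   order. Returns the number of bindings written to out, or -1 if count is
   out of range or position_name is not among the names. */
int plan_attrib_locations(const char *const *names, int count,
                          const char *position_name, AttribBinding *out)
{
    int i;
    int position_index = -1;
    unsigned next = 1;

    assert(names != NULL && position_name != NULL && out != NULL);

    if (count < 0 || count > MAX_ATTRIBS)
        return -1;

    /* Find the position attribute first so nothing is written on failure. */
    for (i = 0; i < count; ++i) {
        if (strcmp(names[i], position_name) == 0) {
            position_index = i;
            break;
        }
    }
    if (position_index < 0)
        return -1;

    for (i = 0; i < count; ++i) {
        out[i].name = names[i];
        out[i].location = (i == position_index) ? 0u : next++;
    }

    /* In a real renderer, each binding would then be applied BEFORE linking,
       because glBindAttribLocation only takes effect at the next link:
         glBindAttribLocation(program, out[i].location, out[i].name);
         ...
         glLinkProgram(program);                                           */
    return count;
}
```

With the names { "my_Normal", "my_Position", "my_Texcoord0" } (the first and last are made up for illustration), the plan assigns 1, 0 and 2 respectively.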

Test program:

https://www.dropbox.com/s/dgg48g1fwgyc5h0/SLOWDOWN_REPRO.zip

How to reproduce:

  • open with XCode
  • open common/gleext.h (don't be disturbed by the name)
  • modify the GLDECLUSAGE_POSITION constant from 0 to 1
  • compile and run => slow
  • change it back to zero => fine
