GLSL: Accessing an array in a for-loop hinders performance

Question

Okay, so I'm developing an Android app for a game I'm making (with LibGDX). I have a fragment shader, and I noticed that I was only getting ~41 FPS. I was playing around with the code to see where the problem was, and I found that changing how I accessed an array from arrayName[i] to arrayName[0] brought the performance back up to 60 FPS, even though the loop only iterates once in this specific instance.
Here's the code:

#version 300 es

precision highp float;

uniform sampler2D u_texture;

in vec2 vTexCoord0;

struct BlackHole {
    vec2 position;
    float radius;
    float deformRadius;
};

uniform vec2 screenSize;
uniform vec3 cameraPos;
uniform float cameraZoom;

uniform BlackHole blackHole[4];
uniform int count;

out vec4 fragColor;

void main() {
    vec2 pos = vTexCoord0;

    bool doSample = true;

    for (int i = 0; i < count; i++) {
        BlackHole hole = blackHole[i];  // <-------- blackHole is the array; changing [i] to [0] here restores 60 FPS
        vec2 position = (hole.position - cameraPos.xy) / cameraZoom + screenSize*0.5;
        float radius = hole.radius / cameraZoom;
        float deformRadius = hole.deformRadius / cameraZoom;

        vec2 deltaPos = vec2(position.x - gl_FragCoord.x, position.y - gl_FragCoord.y);
        float dist = length(deltaPos);

        if (dist <= radius) {
            fragColor = vec4(0, 0, 0, 1);
            doSample = false;
            break;
        } else if (dist <= radius + 1.0) {
            fragColor = vec4(1);
            doSample = false;
        } else if (dist <= deformRadius) {  // lensing
            float distToEdge = deformRadius - dist;
            pos += distToEdge * normalize(deltaPos) / screenSize;
        }
    }

    if (doSample)
        fragColor = texture(u_texture, pos);
}

In this specific case, "count" is 1.
Is this just an intrinsic property of GLSL? Or is there some fix to it – the highest value of "count" would be 4, so I could expand it out and not use a for loop, but I feel like that isn't a very good solution.
So, does anyone know why this is happening and/or a way to fix it?

Recommended answer

See the GLSL ES 3.0 specification, page 140, "12.30 Dynamic Indexing":

For GLSL ES 1.00, support of dynamic indexing of arrays, vectors and matrices was not mandated because it was not directly supported by some implementations. Software solutions (via program transforms) exist for a subset of cases but lead to poor performance.

Note that OpenGL ES 3.0 is still not supported by all devices; around 50% of all Android devices support it at the time of writing. Even where it is supported, the driver/compiler implementation might not be well optimized yet, so the actual result and performance of your code are likely to vary from device to device.

Try to avoid using dynamic branches and loops (it wouldn't even compile for GLSL ES below 3.0). If you know that your loop executes at most 4 times, use a macro to define that value:

#define COUNT 4
...
uniform BlackHole blackHole[COUNT];
...
    for (int i = 0; i < COUNT; i++) {

If you then only need it to loop 2 or 3 times while it was compiled for 4, just fill the remaining items with values that make them behave as if they weren't there (e.g. set the radius to zero). This is also how the libgdx default shader works.
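
As a rough sketch of how that could look when applied to the shader from the question: the name MAX_HOLES and the convention "radius 0.0 marks an unused slot" are assumptions made for this example, not something the original code defines. One detail worth noting is that zeroing the radius alone is not quite enough for this particular shader, because the ring test (dist <= radius + 1.0) would still match pixels near the unused hole's position. The sketch therefore skips unused slots explicitly. That test depends only on a uniform, so all fragments take the same path.

```glsl
#version 300 es
precision highp float;

// Compile-time upper bound (hypothetical name, replaces the "count" uniform).
#define MAX_HOLES 4

struct BlackHole {
    vec2 position;
    float radius;
    float deformRadius;
};

uniform sampler2D u_texture;
uniform vec2 screenSize;
uniform vec3 cameraPos;
uniform float cameraZoom;
uniform BlackHole blackHole[MAX_HOLES];

in vec2 vTexCoord0;
out vec4 fragColor;

void main() {
    vec2 pos = vTexCoord0;
    bool doSample = true;

    // Constant trip count, so the compiler is free to unroll the loop
    // and turn blackHole[i] into constant-index accesses.
    for (int i = 0; i < MAX_HOLES; i++) {
        BlackHole hole = blackHole[i];

        // Assumption: the application uploads radius = 0.0 for unused slots.
        if (hole.radius <= 0.0) continue;

        vec2 position = (hole.position - cameraPos.xy) / cameraZoom + screenSize * 0.5;
        float radius = hole.radius / cameraZoom;
        float deformRadius = hole.deformRadius / cameraZoom;

        vec2 deltaPos = position - gl_FragCoord.xy;
        float dist = length(deltaPos);

        if (dist <= radius) {
            fragColor = vec4(0.0, 0.0, 0.0, 1.0);
            doSample = false;
            break;
        } else if (dist <= radius + 1.0) {
            fragColor = vec4(1.0);
            doSample = false;
        } else if (dist <= deformRadius) {
            pos += (deformRadius - dist) * normalize(deltaPos) / screenSize;
        }
    }

    if (doSample)
        fragColor = texture(u_texture, pos);
}
```

On the application side, all MAX_HOLES entries would need to be uploaded every time, with the unused ones zeroed. Whether this actually recovers the lost frame rate depends on the device's compiler, so it is worth measuring on the target hardware.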

The same goes for branching in your shader. You have quite a few if and else statements in your code. Try to remove those. I haven't looked at your code in depth, but it looks like you could modify it to use e.g. smoothstep instead of branching.
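
One possible way to do that, shown here only as a sketch and not as a drop-in replacement: each if/else test becomes a 0.0/1.0 mask built with step, and the final color is chosen with mix. The names MAX_HOLES, isActive, core, ring and texel are made up for this example, and unused array slots are assumed to have radius 0.0. Unlike the original, this version always samples the texture, even for pixels that end up black or white, so it trades divergent branches for some extra work.

```glsl
#version 300 es
precision highp float;

#define MAX_HOLES 4  // hypothetical compile-time bound

struct BlackHole {
    vec2 position;
    float radius;
    float deformRadius;
};

uniform sampler2D u_texture;
uniform vec2 screenSize;
uniform vec3 cameraPos;
uniform float cameraZoom;
uniform BlackHole blackHole[MAX_HOLES];

in vec2 vTexCoord0;
out vec4 fragColor;

void main() {
    vec2 pos = vTexCoord0;
    float core = 0.0;  // becomes 1.0 if the pixel lies inside any hole
    float ring = 0.0;  // becomes 1.0 if the pixel lies inside any hole or its 1px outline

    for (int i = 0; i < MAX_HOLES; i++) {
        BlackHole hole = blackHole[i];

        // Assumption: unused slots have radius 0.0; step() yields 0.0 for them.
        float isActive = step(0.0001, hole.radius);

        vec2 position = (hole.position - cameraPos.xy) / cameraZoom + screenSize * 0.5;
        float radius = hole.radius / cameraZoom;
        float deformRadius = hole.deformRadius / cameraZoom;

        vec2 deltaPos = position - gl_FragCoord.xy;
        float dist = length(deltaPos);

        // step(a, b) is 1.0 when b >= a, so step(dist, r) means "dist <= r".
        float inCore = isActive * step(dist, radius);
        float inRingOrCore = isActive * step(dist, radius + 1.0);
        float inLens = isActive * step(dist, deformRadius) * (1.0 - inRingOrCore);

        core = max(core, inCore);
        ring = max(ring, inRingOrCore);

        // Divide by a clamped distance so a zero-length deltaPos cannot produce NaN.
        vec2 dir = deltaPos / max(dist, 0.0001);
        pos += inLens * (deformRadius - dist) * dir / screenSize;
    }

    vec4 texel = texture(u_texture, pos);
    vec4 color = mix(texel, vec4(1.0), ring);                 // white outline
    fragColor = mix(color, vec4(0.0, 0.0, 0.0, 1.0), core);   // black core wins over outline
}
```

Replacing step with smoothstep over a small range would soften the hard edges, at the cost of changing the output slightly. As with the loop change, whether the branchless form is actually faster should be checked on the target device.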

A general tip: use a shader editor that shows, in realtime, the impact of the code that you write. For example the PowerVR shader editor or the Adreno shader editor. This will help you a lot.
