GLSL branching behaviour

Question

I have a rather simple fragment shader with a branch and I'm a bit unsure how it is handled by the GLSL compiler and how it would affect performance.

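uniform sampler2D sampler;
uniform vec2 texSize;
uniform vec2 targetSize;

// texture2DNearest and texture2DBicubic are not GLSL built-ins;
// they are user-defined helpers that are not shown here.
void main()
{
    vec4 color;
    // Both operands are uniforms, so the condition has the same value
    // for every fragment in a draw call.
    if(texSize == targetSize)
        color = texture2DNearest(sampler, gl_TexCoord[0]);
    else
        color = texture2DBicubic(sampler, gl_TexCoord[0]);
    gl_FragColor = color;
}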

I have read in an AMD document that sometimes both branches are executed, which would not be a good idea in this case. Without further information or access to a disassembly, I'm unsure what to make of this, or how to avoid it if it is a problem.

Also, from my understanding, a branch based on a uniform variable should not incur any significant overhead, since it is constant over a single pass. Is that right?

Recommended answer

Here you go, the IL generated for your shader:

il_ps_2_0
dcl_input_generic_interp(linear) v1
dcl_resource_id(0)_type(2d)_fmtx(float)_fmty(float)_fmtz(float)_fmtw(float)
eq r2.xy__, c1.xyyy, c0.xyyy
imul r5.x___, r2.x, r2.y
mov r1.x___, r5.x
if_logicalnz r1.x
    sample_resource(0)_sampler(0) r6, v1.xyyy
    mov r7, r6
else
    sample_resource(0)_sampler(0) r8, v1.xyyy
    mov r7, r8
endif
mov r9, r7
mov oC0, r9
endmain

To rephrase what Kos said: what matters is whether the guard condition can be known before execution. That is the case here, since registers c1 and c0 are constants (constant registers start with the letter 'c'), and so the value of r1.x is constant too.
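
If the branch ever did turn out to be a problem, one common way to settle the guard even earlier is to resolve it at compile time: build two program variants and pick one on the CPU before drawing. This is only a minimal sketch, not something taken from the question. It assumes the question's `texture2DNearest` and `texture2DBicubic` helpers are defined in another shader object and accept a `vec2` coordinate, and `USE_BICUBIC` is a hypothetical macro name that the application would prepend to the source as a `#define` line.

```glsl
// Hypothetical: the application prepends "#define USE_BICUBIC\n" to this
// source for the scaling variant and omits it for the 1:1 variant.
uniform sampler2D sampler;

// Assumed signatures; the question does not show these helpers.
vec4 texture2DNearest(sampler2D s, vec2 uv);
vec4 texture2DBicubic(sampler2D s, vec2 uv);

void main()
{
#ifdef USE_BICUBIC
    // Variant used when texSize != targetSize.
    gl_FragColor = texture2DBicubic(sampler, gl_TexCoord[0].xy);
#else
    // Variant used when texSize == targetSize.
    gl_FragColor = texture2DNearest(sampler, gl_TexCoord[0].xy);
#endif
}
```

The comparison of `texSize` and `targetSize` then happens once on the CPU when choosing which program to bind, and neither variant contains a branch at all.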

That means the value is the same for every fragment shader invocation, so no thread divergence can happen.
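
For contrast, here is a sketch of a guard that is not known before execution. The condition depends on the interpolated texture coordinate, so neighbouring fragments can disagree, and hardware that runs fragments in lockstep groups may then end up executing both sides for the group. The 0.5 threshold is arbitrary, and the helper signatures are again an assumption, since the question does not show them.

```glsl
uniform sampler2D sampler;

// Assumed signatures; the question does not show these helpers.
vec4 texture2DNearest(sampler2D s, vec2 uv);
vec4 texture2DBicubic(sampler2D s, vec2 uv);

void main()
{
    vec2 uv = gl_TexCoord[0].xy;
    vec4 color;
    // uv.x differs from fragment to fragment, so this guard is not
    // constant across invocations and divergence becomes possible.
    if (uv.x < 0.5)
        color = texture2DNearest(sampler, uv);
    else
        color = texture2DBicubic(sampler, uv);
    gl_FragColor = color;
}
```

In a disassembly of a shader like this, the guard would be computed from an interpolated input register rather than purely from 'c' registers, which is the distinction to look for.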

By the way, I'm using AMD GPU ShaderAnalyser to translate GLSL into IL. You can also generate native GPU assembly code for a specific generation (ranging from HD29xx to HD58xx). It is a really good tool!
