OpenGL GLSL uniform branching vs. Multiple shaders


Question

I've been reading many articles on uniform if statements that deal with branching to change the behavior of large shaders ("uber shaders"). I started on an uber shader (OpenGL, LWJGL), but then I realized that the simple act of adding an if statement, controlled by a uniform, to a fragment shader that does simple calculations decreased my fps by 5 compared to separate shaders without uniform if statements. I haven't set any cap on my fps; it's just refreshing as fast as possible. I'm about to add normal mapping and parallax mapping, and I can see two routes:

Uber vertex shader:

#version 400 core

layout(location = 0) in vec3 position;
layout(location = 1) in vec2 textureCoords;
layout(location = 2) in vec3 normal;

uniform mat4 MVPmatrix;
uniform float RenderFlag; // 0 = normal mapping, 1 = parallax mapping

void main(void) {
    if (RenderFlag == 0.0) {
        // Calculate out variables for normal mapping to the fragment shader
    }

    if (RenderFlag == 1.0) {
        // Calculate out variables for parallax mapping to the fragment shader
    }

    gl_Position = MVPmatrix * vec4(position, 1.0);
}

Uber fragment shader:

#version 400 core

in vec3 ReflectDirR; // reflection direction passed from the vertex shader

uniform samplerCube cube_texture;
uniform float RenderFlag;     // 0 = normal mapping, 1 = parallax mapping
// If nonzero, either of the 2 render modes will have some reflection of
// the skybox added to it, like a reflective surface.
uniform float reflectionFlag;

out vec4 fragColor; // assumed output name; the original did not declare one

void main(void) {
    if (RenderFlag == 0.0) {
        // display normal mapping
        if (reflectionFlag != 0.0) {
            vec4 reflectColor = texture(cube_texture, ReflectDirR);
            // add reflection color to final color and output
        }
    }

    if (RenderFlag == 1.0) {
        // display parallax mapping
        if (reflectionFlag != 0.0) {
            vec4 reflectColor = texture(cube_texture, ReflectDirR);
            // add reflection color to final color and output
        }
    }
}

The benefit of this (for me) is simplicity in the flow, but it makes the overall program more complex and I'm faced with ugly nested if statements. Also, if I wanted to completely avoid if statements, I would need 4 separate shaders, one to handle each possible branch (normal w/o reflection : normal with reflection : parallax w/o reflection : parallax with reflection), just for one feature, reflection.

1: Does GLSL execute both branches (and any subsequent branches), calculate BOTH results, and then output the correct one?

2: Instead of a uniform flag for the reflection, should I remove the if statement in favor of calculating the reflection color regardless and adding it to the final color, if it is a relatively small operation, with something like

finalColor = finalColor + reflectionColor * X 
where X is a uniform variable: X == 0 for no reflection, X == some amount for reflection.
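
For illustration, the branch-free blend from question 2 could look roughly like the fragment shader below. This is only a sketch: `passTextureCoords`, `diffuseTexture`, `reflectionAmount` (the "X" above) and `fragColor` are hypothetical names that are not in the original shaders, and the base color lookup stands in for whatever the real lighting code produces.

```glsl
#version 400 core

in vec2 passTextureCoords;          // hypothetical varying from the vertex shader
in vec3 ReflectDirR;                // reflection direction, as in the question

uniform sampler2D diffuseTexture;   // hypothetical base color texture
uniform samplerCube cube_texture;   // skybox cube map, as in the question
uniform float reflectionAmount;     // the "X" above: 0.0 means no reflection

out vec4 fragColor;                 // hypothetical output name

void main(void) {
    // Placeholder for the normal-mapped or parallax-mapped shading result.
    vec4 finalColor = texture(diffuseTexture, passTextureCoords);

    // Always sample the cube map; there is no if statement.
    vec4 reflectColor = texture(cube_texture, ReflectDirR);

    // With reflectionAmount == 0.0 the reflection term contributes nothing.
    fragColor = vec4(finalColor.rgb + reflectColor.rgb * reflectionAmount,
                     finalColor.a);
}
```

Note the trade-off: the cube map fetch is paid for on every fragment, even when `reflectionAmount` is 0.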

Recommended answer

Right off the bat, let me point out that GL4 has added subroutines, which are sort of a combination of both things you discussed. However, unless you are using a massive number of permutations of a single basic shader that gets swapped out multiple times during a frame (as you might if you had some dynamic material system in a forward rendering engine), subroutines really are not a performance win. I've put some time and effort into this in my own work and I get worthwhile improvements on one particular hardware/driver combination, and no appreciable change (good or bad) on most others.
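
As a rough sketch of what a GL4 subroutine looks like (not code from the question), the fragment shader below selects between two reflection behaviors through a subroutine uniform instead of an if statement. The names `ReflectionFunc`, `noReflection`, `skyboxReflection`, `applyReflection` and `fragColor` are made up for illustration, and the constant base color is a placeholder for the real shading.

```glsl
#version 400 core

in vec3 ReflectDirR;                // reflection direction, as in the question
uniform samplerCube cube_texture;   // skybox cube map, as in the question
out vec4 fragColor;                 // hypothetical output name

// Subroutine type: any function with this signature can be selected.
subroutine vec4 ReflectionFunc(vec4 baseColor);

// Variant 1: leave the color untouched.
subroutine(ReflectionFunc)
vec4 noReflection(vec4 baseColor) {
    return baseColor;
}

// Variant 2: add the skybox reflection to the color.
subroutine(ReflectionFunc)
vec4 skyboxReflection(vec4 baseColor) {
    return baseColor + texture(cube_texture, ReflectDirR);
}

// The application picks which variant this uniform refers to.
subroutine uniform ReflectionFunc applyReflection;

void main(void) {
    // Placeholder for the normal-mapped or parallax-mapped shading result.
    vec4 baseColor = vec4(0.5, 0.5, 0.5, 1.0);
    fragColor = applyReflection(baseColor);
}
```

On the application side, the variant would be looked up with `glGetSubroutineIndex` and selected with `glUniformSubroutinesuiv`. The selection is not stored with the program object, so it generally has to be set again after each `glUseProgram`.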

Why did I bring up subroutines? Mostly because you're discussing what amounts to micro optimization, and subroutines are a really good example of why it doesn't pay to invest a whole lot of time thinking about that until the very end of development. If you're struggling to meet some performance number and you've crossed every high-level optimization strategy off the list, then you can worry about this stuff.

That said, it's almost impossible to answer how GLSL executes your shader. It's just a high-level language; the underlying hardware architectures have changed several times over since GLSL was created. The latest generation of hardware has actual branch predication and some pretty complicated threading engines that GLSL 1.10 class hardware never had, some of which is actually exposed directly through compute shaders now.

You could run the numbers to see which strategy works best on your hardware, but I think you'll find it's the old micro optimization dilemma, and you may not even get enough of a measurable difference in performance to guess which approach to take. Keep in mind that "uber shaders" are attractive for multiple reasons (not all performance related), not the least of which is that you may have fewer and less complicated draw commands to batch. If there's no appreciable difference in performance, consider the design that's simpler and easier to implement / maintain instead.
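
If the separate-shader route does turn out to matter, the four variants from the question need not be four hand-maintained files. One common approach, sketched here as an assumption rather than something the answer prescribes, is to compile a single source several times with different preprocessor defines. `USE_PARALLAX`, `USE_REFLECTION` and `fragColor` are hypothetical names, and the constant colors are placeholders for the real shading code.

```glsl
#version 400 core
// Hypothetical compile-time switches. The application would insert lines
// such as these right after #version before compiling each variant:
//   #define USE_PARALLAX
//   #define USE_REFLECTION

in vec3 ReflectDirR;                // reflection direction, as in the question
uniform samplerCube cube_texture;   // skybox cube map, as in the question
out vec4 fragColor;                 // hypothetical output name

void main(void) {
#ifdef USE_PARALLAX
    vec4 finalColor = vec4(0.4, 0.4, 0.4, 1.0); // placeholder: parallax-mapped shading
#else
    vec4 finalColor = vec4(0.6, 0.6, 0.6, 1.0); // placeholder: normal-mapped shading
#endif

#ifdef USE_REFLECTION
    // Only compiled into the variants that were built with reflection.
    finalColor += texture(cube_texture, ReflectDirR);
#endif

    fragColor = finalColor;
}
```

Each compiled variant contains no runtime branch, at the cost of more program objects to create and switch between.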
