HLSL branch avoidance

Views: 189 | Posted: 2020/5/21 20:57:30 | Tags: optimization, branch, shader, hlsl


Question

I have a shader in which I want to move half of the vertices in the vertex shader. I'm trying to decide the best way to do this from a performance standpoint: we're dealing with well over 100,000 verts, so speed is critical. I've looked at 3 different methods. (This is pseudo-code, but enough to give you the idea. I can't give out the <complex formula>, but I can say that it involves a sin() function, a function call (it just returns a number, but it is still a function call), and a bunch of basic arithmetic on floating-point numbers.)

if (y < 0.5)
{
    x += <complex formula>;
}

This has the advantage that the <complex formula> is only executed half the time. The downside is that it definitely causes a branch, which may actually be slower than evaluating the formula. It is the most readable version, but in this context we care more about speed than readability.
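As a rough sketch of how this variant could be written for profiling, HLSL's `[branch]` and `[flatten]` attributes on `if` ask the compiler for a dynamic branch or for evaluating both sides and selecting a result, respectively. `ComplexFormula`, its body, and the parameter `t` are hypothetical stand-ins for the undisclosed formula; which variant is faster depends on the hardware and the real formula, so both would need to be measured.

```hlsl
// Hypothetical stand-in for the undisclosed <complex formula>.
float ComplexFormula(float t)
{
    return sin(t) * 0.25 + 0.1;
}

// Request a dynamic branch: only the taken side is evaluated.
float OffsetWithBranch(float x, float y, float t)
{
    [branch]
    if (y < 0.5)
    {
        x += ComplexFormula(t);
    }
    return x;
}

// Request flattening: both sides are evaluated and one result is selected.
float OffsetFlattened(float x, float y, float t)
{
    [flatten]
    if (y < 0.5)
    {
        x += ComplexFormula(t);
    }
    return x;
}
```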

x += step(y, 0.5) * <complex formula>;

Using HLSL's step() function (which returns 0 if the first parameter is greater than the second and 1 otherwise), you can eliminate the branch. But now the <complex formula> is evaluated every time, and half of the time its result is multiplied by 0 (wasted effort).
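A minimal sketch of the step() variant follows, again with a hypothetical `ComplexFormula` and parameter `t` standing in for the real formula. One detail worth noting: `step(y, 0.5)` is 1 when `y` is exactly 0.5, whereas `if (y < 0.5)` skips that case, so the two versions differ on the boundary.

```hlsl
// Hypothetical stand-in for the undisclosed <complex formula>.
float ComplexFormula(float t)
{
    return sin(t) * 0.25 + 0.1;
}

float OffsetWithStep(float x, float y, float t)
{
    // step(y, 0.5) yields 1 when 0.5 >= y and 0 otherwise.
    float mask = step(y, 0.5);

    // To match "y < 0.5" exactly (excluding y == 0.5), this could be used instead:
    // float mask = 1.0 - step(0.5, y);

    // The formula is always evaluated; the mask zeroes it out when not wanted.
    x += mask * ComplexFormula(t);
    return x;
}
```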

x += (y < 0.5) ? <complex formula> : 0;

This one I don't know about. Does the ?: cause a branch? And if not, are both sides of the expression evaluated, or only the one that is relevant?

The final possibility is that the <complex formula> could be offloaded to the CPU instead of the GPU, but I worry that the CPU will be slower at calculating sin() and the other operations, which might result in a net loss. Also, it means one more number has to be passed to the shader, and that could cause overhead as well. Does anyone have any insight as to which would be the best course of action?
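For the CPU option, the shader side might look roughly like the sketch below. It assumes the formula does not depend on per-vertex inputs, so a single value can be computed once on the CPU and uploaded; the constant buffer name `PerFrame`, its register slot, and `gPrecomputedOffset` are all hypothetical.

```hlsl
// Hypothetical constant buffer holding the CPU-evaluated <complex formula>.
cbuffer PerFrame : register(b0)
{
    float gPrecomputedOffset;
};

float OffsetFromConstant(float x, float y)
{
    // Nothing expensive is left to skip: both operands of the select
    // are already available, so this only chooses between two values.
    x += (y < 0.5) ? gPrecomputedOffset : 0.0;
    return x;
}
```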

Addendum:

According to http://msdn.microsoft.com/en-us/library/windows/desktop/bb509665%28v=vs.85%29.aspx, the step() function uses a ?: internally, so it's probably no better than my 3rd solution, and potentially worse, since the <complex formula> is definitely evaluated every time, whereas with a straight ?: it may only be evaluated half the time. (Nobody's answered that part of the question yet.) Though avoiding both and using:

x += (1.0 - y) * <complex formula>;

may well be better than any of them, since no comparison is made anywhere. (And y is always either 0 or 1.) It still evaluates the <complex formula> needlessly half the time, but that might be worth it to avoid branches altogether.
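Written out as a function, the comparison-free version could look like this sketch. `ComplexFormula` and `t` are hypothetical stand-ins, and the code relies on the stated assumption that `y` is exactly 0 or 1; for any value in between, the offset would be scaled rather than switched on or off.

```hlsl
// Hypothetical stand-in for the undisclosed <complex formula>.
float ComplexFormula(float t)
{
    return sin(t) * 0.25 + 0.1;
}

float OffsetWithMask(float x, float y, float t)
{
    // Assumes y is exactly 0.0 or 1.0:
    //   y == 0.0 -> the full offset is added
    //   y == 1.0 -> nothing is added
    x += (1.0 - y) * ComplexFormula(t);
    return x;
}
```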
