Why does the speed of this SOR solver depend on the input?


Question

Related to my other question, I have now modified the sparse matrix solver to use the SOR (Successive Over-Relaxation) method. The code is now as follows:

void SORSolver::step() {
    float const omega = 1.0f;
    float const
        *b = &d_b(1, 1),
        *w = &d_w(1, 1), *e = &d_e(1, 1), *s = &d_s(1, 1), *n = &d_n(1, 1),
        *xw = &d_x(0, 1), *xe = &d_x(2, 1), *xs = &d_x(1, 0), *xn = &d_x(1, 2);
    float *xc = &d_x(1, 1);
    for (size_t y = 1; y < d_ny - 1; ++y) {
        for (size_t x = 1; x < d_nx - 1; ++x) {
            float diff = *b
                - *xc
                - *e * *xe
                - *s * *xs - *n * *xn
                - *w * *xw;
            *xc += omega * diff;
            ++b;
            ++w; ++e; ++s; ++n;
            ++xw; ++xe; ++xs; ++xn;
            ++xc;
        }
        b += 2;
        w += 2; e += 2; s += 2; n += 2;
        xw += 2; xe += 2; xs += 2; xn += 2;
        xc += 2;
    }
}

Now the weird thing is: if I increase omega (the relaxation factor), the execution speed starts to depend dramatically on the values inside the various arrays!

For omega = 1.0f, the execution time is more or less constant. For omega = 1.8, the first time, it will typically take, say, 5 milliseconds to execute this step() 10 times, but this will gradually increase to nearly 100 ms during the simulation. If I set omega = 1.0001f, I see an accordingly slight increase in execution time; the higher omega goes, the faster execution time will increase during the simulation.

Since all this is embedded inside the fluid solver, it's hard to come up with a standalone example. But I have saved the initial state and rerun the solver on that state every time step, as well as solving for the actual time step. For the initial state it was fast, for the subsequent time steps incrementally slower. Since all else is equal, that proves that the execution speed of this code is dependent on the values in those six arrays.

This is reproducible on Ubuntu with g++, as well as on 64-bit Windows 7 when compiling for 32-bits with VS2008.

I heard that NaN and Inf values can be slower for floating point calculations, but no NaNs or Infs are present. Is it possible that the speed of float computations otherwise depends on the values of the input numbers?

Answer

The short answer to your last question is "yes" - denormalized (very close to zero) numbers require special handling and can be much slower. My guess is that they're creeping into the simulation as time goes on. See this related SO post: http://stackoverflow.com/questions/2051534/floating-point-math-execution-time
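One way to test that guess is to count the denormal values in the solver's arrays after each step. The following is a minimal sketch, assuming the arrays are contiguous `float` buffers; `count_subnormals` is a hypothetical helper, not part of the original `SORSolver`.

```cpp
#include <cmath>    // std::fpclassify, FP_SUBNORMAL
#include <cstddef>  // std::size_t

// Hypothetical diagnostic helper: counts the denormal (subnormal) values
// in a contiguous buffer of floats. Zeros and normal values are ignored.
std::size_t count_subnormals(float const *data, std::size_t count) {
    std::size_t found = 0;
    for (std::size_t i = 0; i < count; ++i) {
        if (std::fpclassify(data[i]) == FP_SUBNORMAL) {
            ++found;
        }
    }
    return found;
}
```

If the count over `d_x` (or the coefficient arrays) grows as the simulation proceeds, in step with the slowdown, that would support the denormal explanation.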

Setting the floating-point control to flush denormals to zero should take care of things with a negligible impact on the simulation quality.
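On x86 with SSE, flushing denormals to zero could look roughly like the sketch below. It uses the `_MM_SET_FLUSH_ZERO_MODE` and `_MM_SET_DENORMALS_ZERO_MODE` macros from `<xmmintrin.h>` and `<pmmintrin.h>`; the function names `enable_flush_to_zero` and `flush_to_zero_enabled` and the `SOR_HAVE_SSE_CSR` macro are made up for illustration, and the preprocessor guard is an assumption about which targets expose these headers.

```cpp
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <xmmintrin.h>  // _MM_SET_FLUSH_ZERO_MODE, _MM_GET_FLUSH_ZERO_MODE
#include <pmmintrin.h>  // _MM_SET_DENORMALS_ZERO_MODE
#define SOR_HAVE_SSE_CSR 1
#else
#define SOR_HAVE_SSE_CSR 0
#endif

// Hypothetical helper: asks the SSE unit to flush denormal results to zero
// (FTZ) and to treat denormal inputs as zero (DAZ) on the calling thread.
// Returns false when the build target offers no SSE control register.
bool enable_flush_to_zero() {
#if SOR_HAVE_SSE_CSR
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);
    return true;
#else
    return false;
#endif
}

// Hypothetical helper: reports whether FTZ is currently set on this thread.
bool flush_to_zero_enabled() {
#if SOR_HAVE_SSE_CSR
    return _MM_GET_FLUSH_ZERO_MODE() == _MM_FLUSH_ZERO_ON;
#else
    return false;
#endif
}
```

The setting lives in the MXCSR register, which is per-thread, so it would need to be applied on whichever thread runs `step()`. It also only affects SSE arithmetic: a 32-bit build that still uses x87 instructions for `float` math would have to be compiled for SSE code generation (for example `-msse2 -mfpmath=sse` with g++, or `/arch:SSE2` with Visual C++) before this has any effect.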
