How to efficiently compare the sign of two floating-point values while handling negative zeros


Problem description


    Given two floating-point numbers, I'm looking for an efficient way to check if they have the same sign, given that if any of the two values is zero (+0.0 or -0.0), they should be considered to have the same sign.

    For instance,

    • SameSign(1.0, 2.0) should return true
    • SameSign(-1.0, -2.0) should return true
    • SameSign(-1.0, 2.0) should return false
    • SameSign(0.0, 1.0) should return true
    • SameSign(0.0, -1.0) should return true
    • SameSign(-0.0, 1.0) should return true
    • SameSign(-0.0, -1.0) should return true

    A naive but correct implementation of SameSign in C++ would be:

    bool SameSign(float a, float b)
    {
        if (fabs(a) == 0.0f || fabs(b) == 0.0f)
            return true;
    
        return (a >= 0.0f) == (b >= 0.0f);
    }
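
The expected results listed above can be turned into a quick self-check. This is only a sketch: `SameSignNaive` and `CheckExamples` are names introduced here for illustration and are not part of the original post.

```cpp
#include <cmath>

// Same logic as the naive implementation above, renamed so that it can
// coexist with the other variants in one translation unit.
bool SameSignNaive(float a, float b)
{
    if (std::fabs(a) == 0.0f || std::fabs(b) == 0.0f)
        return true;  // +0.0 and -0.0 match any sign

    return (a >= 0.0f) == (b >= 0.0f);
}

// Hypothetical helper: returns true when every example from the list
// above produces the expected result.
bool CheckExamples()
{
    return  SameSignNaive( 1.0f,  2.0f)
        &&  SameSignNaive(-1.0f, -2.0f)
        && !SameSignNaive(-1.0f,  2.0f)
        &&  SameSignNaive( 0.0f,  1.0f)
        &&  SameSignNaive( 0.0f, -1.0f)
        &&  SameSignNaive(-0.0f,  1.0f)
        &&  SameSignNaive(-0.0f, -1.0f);
}
```

The same table of cases can be reused for any other candidate implementation by swapping the function being called.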
    

    Assuming the IEEE floating-point model, here's a variant of SameSign that compiles to branchless code (at least with Visual C++ 2008):

    bool SameSign(float a, float b)
    {
        int ia = binary_cast<int>(a);
        int ib = binary_cast<int>(b);
    
        int az = (ia & 0x7FFFFFFF) == 0;
        int bz = (ib & 0x7FFFFFFF) == 0;
        int ab = (ia ^ ib) >= 0;
    
        return (az | bz | ab) != 0;
    }
    

    with binary_cast defined as follows:

    template <typename Target, typename Source>
    inline Target binary_cast(Source s)
    {
        union
        {
            Source  m_source;
            Target  m_target;
        } u;
        u.m_source = s;
        return u.m_target;
    }
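
One caveat about the union-based cast: reading a union member other than the one most recently written is undefined behavior in standard C++, even though many compilers accept it in practice. A possible alternative, sketched below, copies the bytes with `std::memcpy` (C++20 also offers `std::bit_cast` for the same purpose). The names `binary_cast_memcpy` and `SameSignBits` are hypothetical, and the code assumes 32-bit IEEE 754 floats.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical replacement for binary_cast: copies the object
// representation instead of reading an inactive union member.
template <typename Target, typename Source>
inline Target binary_cast_memcpy(const Source& s)
{
    static_assert(sizeof(Target) == sizeof(Source), "sizes must match");
    Target t;
    std::memcpy(&t, &s, sizeof(Target));
    return t;
}

// The bit-trick variant from above, rebuilt on binary_cast_memcpy.
inline bool SameSignBits(float a, float b)
{
    const std::int32_t ia = binary_cast_memcpy<std::int32_t>(a);
    const std::int32_t ib = binary_cast_memcpy<std::int32_t>(b);

    const int az = (ia & 0x7FFFFFFF) == 0;  // a is +0.0 or -0.0
    const int bz = (ib & 0x7FFFFFFF) == 0;  // b is +0.0 or -0.0
    const int ab = (ia ^ ib) >= 0;          // sign bits are equal

    return (az | bz | ab) != 0;
}
```

Optimizing compilers typically turn a fixed-size `std::memcpy` like this into a plain register move, so the cast itself should not add measurable cost, though that is worth confirming in the generated assembly.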
    

    I'm looking for two things:

    1. A faster, more efficient implementation of SameSign, using bit tricks, FPU tricks or even SSE intrinsics.

    2. An efficient extension of SameSign to three values.
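
For the second item, the post does not spell out what "same sign" should mean for three values. One possible reading, assumed in the sketch below, is that the values agree unless at least one is strictly positive while another is strictly negative. `SameSign3` is a hypothetical name, and this is a straightforward formulation rather than a tuned one.

```cpp
// Hypothetical three-value extension. Assumption: the values disagree
// only when one is strictly positive and another is strictly negative.
inline bool SameSign3(float a, float b, float c)
{
    const bool anyPositive = (a > 0.0f) || (b > 0.0f) || (c > 0.0f);
    const bool anyNegative = (a < 0.0f) || (b < 0.0f) || (c < 0.0f);

    // +0.0 and -0.0 are neither greater nor less than 0.0f, so zeros
    // never set either flag and therefore match any sign.
    return !(anyPositive && anyNegative);
}
```

Under this reading, `SameSign3(1.0f, 0.0f, -1.0f)` is false because the two non-zero values disagree, while `SameSign3(-1.0f, -0.0f, -2.0f)` is true.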

    Edit:

    I've made some performance measurements on the three variants of SameSign (the two variants described in the original question, plus Stephen's one). Each function was run 200-400 times, on all consecutive pairs of values in an array of 101 floats filled at random with -1.0, -0.0, +0.0 and +1.0. Each measurement was repeated 2000 times and the minimum time was kept (to weed out all cache effects and system-induced slowdowns). The code was compiled with Visual C++ 2008 SP1 with maximum optimization and SSE2 code generation enabled. The measurements were done on a Core 2 Duo P8600 2.4 GHz.

    Here are the timings, not counting the overhead of fetching input values from the array, calling the function and retrieving the result (which amounts to 6-7 clock ticks):

    • Naive variant: 15 ticks
    • Bit magic variant: 13 ticks
    • Stephen's variant: 6 ticks
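
The original benchmark code is not shown, so the following is only a rough sketch of the methodology described above: random inputs drawn from -1.0, -0.0, +0.0 and +1.0, repeated passes over all consecutive pairs, and the minimum over many repetitions. All names (`MakeInputs`, `MinTimeNs`, `SameSignNaive`) are hypothetical, and it reports nanoseconds from `std::chrono::steady_clock` rather than clock ticks.

```cpp
#include <algorithm>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <limits>
#include <random>
#include <vector>

// One candidate under test: the naive variant from the question.
bool SameSignNaive(float a, float b)
{
    if (std::fabs(a) == 0.0f || std::fabs(b) == 0.0f)
        return true;
    return (a >= 0.0f) == (b >= 0.0f);
}

// Fills an array of `count` floats at random with -1.0, -0.0, +0.0, +1.0.
std::vector<float> MakeInputs(std::size_t count, unsigned seed)
{
    static const float kValues[4] = { -1.0f, -0.0f, 0.0f, 1.0f };
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> pick(0, 3);
    std::vector<float> inputs(count);
    for (float& v : inputs)
        v = kValues[pick(rng)];
    return inputs;
}

// Calls `fn` on all consecutive pairs `passes` times, repeats that whole
// measurement `repeats` times, and returns the minimum time in nanoseconds.
template <typename Fn>
long long MinTimeNs(Fn fn, const std::vector<float>& inputs,
                    int passes, int repeats)
{
    long long best = std::numeric_limits<long long>::max();
    volatile int sink = 0;  // keeps the results observable to the compiler
    for (int r = 0; r < repeats; ++r) {
        const auto start = std::chrono::steady_clock::now();
        int trueCount = 0;
        for (int p = 0; p < passes; ++p)
            for (std::size_t i = 0; i + 1 < inputs.size(); ++i)
                trueCount += fn(inputs[i], inputs[i + 1]) ? 1 : 0;
        const auto stop = std::chrono::steady_clock::now();
        sink = trueCount;
        const long long ns = std::chrono::duration_cast<
            std::chrono::nanoseconds>(stop - start).count();
        best = std::min(best, ns);
    }
    return best;
}
```

A call such as `MinTimeNs(SameSignNaive, MakeInputs(101, 1234), 200, 2000)` mirrors the parameters quoted above. Unlike the reported figures, this sketch does not subtract the loop and call overhead, so absolute numbers will not be comparable with the ticks listed here.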

    Solution

    If you don't need to support infinities, you can just use:

    inline bool SameSign(float a, float b) {
        return a*b >= 0.0f;
    }
    

    which is actually pretty fast on most modern hardware, and is completely portable. It doesn't work properly in the (zero, infinity) case however, because zero * infinity is NaN, and the comparison will return false, regardless of the signs. It will also incur a denormal stall on some hardware when a and b are both tiny.
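
The (zero, infinity) caveat is easy to reproduce. The sketch below wraps the multiplication-based variant under the hypothetical name `SameSignMul` and adds a small helper, `ZeroTimesInfinityIsNaN`, that shows why the comparison fails. It assumes default IEEE 754 semantics, meaning no fast-math style compiler options.

```cpp
#include <cmath>
#include <limits>

// The multiplication-based variant from the answer, renamed for this sketch.
inline bool SameSignMul(float a, float b)
{
    return a * b >= 0.0f;
}

// 0 * infinity is NaN under IEEE 754, and every ordered comparison
// involving NaN evaluates to false.
inline bool ZeroTimesInfinityIsNaN()
{
    const float inf = std::numeric_limits<float>::infinity();
    return std::isnan(0.0f * inf);
}
```

With these definitions, `SameSignMul(0.0f, std::numeric_limits<float>::infinity())` returns false even though a zero operand should match any sign, while ordinary cases such as `SameSignMul(-0.0f, 1.0f)` still return true because -0.0 compares equal to 0.0.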
