佩恩Hanek算法的实现用C [英] Payne Hanek algorithm implementation in C

查看:407
本文介绍了佩恩Hanek算法的实现用C的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我努力理解如何实现由佩恩和Hanek(范围缩小为三角函数)公布的范围缩小算法

I'm struggling to understand how TO IMPLEMENT the range reduction algorithm published by Payne and Hanek (range reduction for trigonometric functions)

我已经看到有这个库: http://www.netlib.org/fdlibm/

I've seen there's this library: http://www.netlib.org/fdlibm/

不过,在我看来如此扭曲,所有的理论解释我已经创办过于简单,提供一个实现。

But it looks to me so twisted, and all the theoretical explanation i've founded are too simple to provide an implementation.

有没有一些好......好......好它的解释?

Is there some good... good... good explanation of it?

推荐答案

通过佩恩 - Hanek的算法进行参数减少三角函数实际上是pretty的简单。与其他参数减排项目,计算 N = round_nearest(X /(π/ 2)),然后计算通过余x - N *π/ 2 。更高的效率是通过计算实现 N = round_nearest(X *(2 /π))

Performing argument reduction for trigonometric functions via the Payne-Hanek algorithm is actually pretty straightforward. As with other argument reduction schemes, compute n = round_nearest (x / (π/2)), then compute remainder via x - n * π/2. Better efficiency is achieved by computing n = round_nearest (x * (2/π)).

关键的观察佩恩 - Hanek是计算的 X其余的时候 - N *π/ 2 使用完整的圆唇产品加减运算时,领先位取消,所以我们并不需要计算那些。我们只剩下发现的基础上 X 的大小合适的起点(非零位)的问题。如果 X 接近π/ 2 的倍数,可能会有额外的补偿,这是有限的。一个可以参考文献的上限的该取消在这种情况下,附加比特的数量。由于相对较高的计算成本,佩恩 - Hanek通常仅用于参数大的幅度,其中有额外的好处就是加减运算时,原参数的位 X 为零在相关位的位置。

The key observation in Payne-Hanek is that when computing the remainder of x - n * π/2 using the full unrounded product the leading bits cancel during subtraction, so we do not need to compute those. We are left with the problem of finding the right starting point (non-zero bits) based on the magnitude of x. If x is close to a multiple of π/2, there may be additional cancellation, which is limited. One can consult the literature for an upper bound on the number of additional bits that cancel in such cases. Due to relatively high computational cost, Payne-Hanek is usually only used for arguments large in magnitude, which has the additional benefit that during subtraction the bits of the original argument x are zero in the relevant bit positions.

下面我给彻底测试C99 $ C $下单precision SINF(),我最近写了采用佩恩 - Hanek减少慢路径还原,请参阅 trig_red_slowpath_f()。需要注意的是,为了达到一个忠实四舍五入 SINF()一个人必须要增加参数减少到减少参数返回两个浮动的操作数在头/尾的方式。

Below I show exhaustively tested C99 code for single-precision sinf() that I wrote recently that incorporates Payne-Hanek reduction in the slow path of the reduction, see trig_red_slowpath_f(). Note that in order to achieve a faithfully rounded sinf() one would have to augment the argument reduction to return the reduced argument as two float operands in head/tail fashion.

各种设计选择是可能的,下面我以便最小化的存储 2 /π的所需的位选择了主要基于整数的运算。使用浮点运算和重叠的对或浮点数字的三倍来存储的位实现 2 /π也很常见。

Various design choices are possible, below I opted for largely integer-based computation in order to minimize the storage for the required bits of 2/π. Implementations using floating-point computation and overlapped pairs or triples of floating-point numbers to store the bits of 2/π are also common.

/* 190 bits of 2/pi for Payne-Hanek style argument reduction. */
static const unsigned int two_over_pi_f [] = 
{
    0x00000000,
    0x28be60db,
    0x9391054a,
    0x7f09d5f4,
    0x7d4d3770,
    0x36d8a566,
    0x4f10e410
};

float trig_red_slowpath_f (float a, int *quadrant)
{
    unsigned long long int p;
    unsigned int ia, hi, mid, lo, tmp, i;
    int e, q;
    float r;

    ia = (unsigned int)(fabsf (frexpf (a, &e)) * 0x1.0p32f);

    /* extract 96 relevant bits of 2/pi based on magnitude of argument */ 
    i = (unsigned int)e >> 5;
    e = (unsigned int)e & 31;

    hi  = two_over_pi_f [i+0];
    mid = two_over_pi_f [i+1];
    lo  = two_over_pi_f [i+2];
    tmp = two_over_pi_f [i+3];

    if (e) {
        hi  = (hi  << e) | (mid >> (32 - e));
        mid = (mid << e) | (lo  >> (32 - e));
        lo  = (lo  << e) | (tmp >> (32 - e));
    }

    /* compute product x * 2/pi in 2.62 fixed-point format */
    p = (unsigned long long int)ia * lo;
    p = (unsigned long long int)ia * mid + (p >> 32);
    p = ((unsigned long long int)(ia * hi) << 32) + p;

    /* round quotient to nearest */
    q = (int)(p >> 62);                // integral portion = quadrant<1:0>
    p = p & 0x3fffffffffffffffULL;     // fraction
    if (p & 0x2000000000000000ULL) {   // fraction >= 0.5
        p = p - 0x4000000000000000ULL; // fraction - 1.0
        q = q + 1;
    }

    /* compute remainder of x / (pi/2) */
    double d;

    d = (double)(long long int)p;
    d = d * 0x1.921fb54442d18p-62; // 1.5707963267948966 * 0x1.0p-62
    r = (float)d;
    if (a < 0.0f) {
        r = -r;
        q = -q;
    }

    *quadrant = q;
    return r;
}

/* Like rintf(), but -0.0f -> +0.0f, and |a| must be <= 0x1.0p+22 */
float quick_and_dirty_rintf (float a)
{
    float cvt_magic = 0x1.800000p+23f;
    return (a + cvt_magic) - cvt_magic;
}

/* Argument reduction for trigonometric functions that reduces the argument
   to the interval [-PI/4, +PI/4] and also returns the quadrant. It returns 
   -0.0f for an input of -0.0f 
*/
float trig_red_f (float a, float switch_over, int *q)
{    
    float j, r;

    if (fabsf (a) > switch_over) {
        /* Payne-Hanek style reduction. M. Payne and R. Hanek, Radian reduction
           for trigonometric functions. SIGNUM Newsletter, 18:19-24, 1983
        */
        r = trig_red_slowpath_f (a, q);
    } else {
        /* FMA-enhanced Cody-Waite style reduction. W. J. Cody and W. Waite, 
           "Software Manual for the Elementary Functions", Prentice-Hall 1980
        */
        j = 0x1.45f306p-1f * a;             // 2/pi
        j = quick_and_dirty_rintf (j);
        r = fmaf (j, -0x1.921fb0p+00f, a);  // pio2_high
        r = fmaf (j, -0x1.5110b4p-22f, r);  // pio2_mid
        r = fmaf (j, -0x1.846988p-48f, r);  // pio2_low
        *q = (int)j;
    }
    return r;
}

/* Approximate sine on [-PI/4,+PI/4]. Maximum ulp error = 0.64721
   Returns -0.0f for an argument of -0.0f
   Polynomial approximation based on unpublished work by T. Myklebust
*/
float sinf_poly (float a, float s)
{
    float r;

    r =              0x1.7d3bbcp-19f;
    r = fmaf (r, s, -0x1.a06bbap-13f);
    r = fmaf (r, s,  0x1.11119ap-07f);
    r = fmaf (r, s, -0x1.555556p-03f);
    r = r * s + 0.0f; // ensure -0 is passed trough
    r = fmaf (r, a, a);
    return r;
}

/* Approximate cosine on [-PI/4,+PI/4]. Maximum ulp error = 0.87531 */
float cosf_poly (float s)
{
    float r;

    r =              0x1.98e616p-16f;
    r = fmaf (r, s, -0x1.6c06dcp-10f);
    r = fmaf (r, s,  0x1.55553cp-05f);
    r = fmaf (r, s, -0x1.000000p-01f);
    r = fmaf (r, s,  0x1.000000p+00f);
    return r;
}

/* Map sine or cosine value based on quadrant */
float sinf_cosf_core (float a, int i)
{
    float r, s;

    s = a * a;
    r = (i & 1) ? cosf_poly (s) : sinf_poly (a, s);
    if (i & 2) {
        r = 0.0f - r; // don't change "sign" of NaNs
    }
    return r;
}

/* maximum ulp error = 1.49241 */
float my_sinf (float a)
{
    float r;
    int i;

    a = a * 0.0f + a; // inf -> NaN
    r = trig_red_f (a, 117435.992f, &i);
    r = sinf_cosf_core (r, i);
    return r;
}

/* maximum ulp error = 1.49510 */
float my_cosf (float a)
{
    float r;
    int i;

    a = a * 0.0f + a; // inf -> NaN
    r = trig_red_f (a, 71476.0625f, &i);
    r = sinf_cosf_core (r, i + 1);
    return r;
}

这篇关于佩恩Hanek算法的实现用C的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆