Implementation of sinpi() and cospi() using standard C math library

Problem description

The function sinpi(x) computes sin(πx), and the function cospi(x) computes cos(πx), where the multiplication with π is implicit inside the functions. These functions were initially introduced into the C standard math library as an extension by Sun Microsystems in the late 1980s. IEEE Std 754™-2008 specifies the equivalent functions sinPi and cosPi in section 9.

There are numerous computations where sin(πx) and cos(πx) occur naturally. A very simple example is the Box-Muller transform (G. E. P. Box and Mervin E. Muller, "A Note on the Generation of Random Normal Deviates", The Annals of Mathematical Statistics, Vol. 29, No. 2, 1958, pp. 610-611), which, given two independent random variables U₁ and U₂ with uniform distribution, produces independent random variables Z₁ and Z₂ with standard normal distribution:

Z₁ = √(-2 ln U₁) cos (2 π U₂)
Z₂ = √(-2 ln U₁) sin (2 π U₂)

A further example is the computation of sine and cosine for degree arguments, as in this computation of great-circle distance using the Haversine formula:

/* This function computes the great-circle distance of two points on earth 
   using the Haversine formula, assuming spherical shape of the planet. A 
   well-known numerical issue with the formula is reduced accuracy in the 
   case of near antipodal points.

   lat1, lon1  latitude and longitude of first point, in degrees
   lat2, lon2  latitude and longitude of second point, in degrees
               (latitudes in [-90,+90], longitudes in [-180,+180])
   radius      radius of the earth in user-defined units, e.g. 6378.2 km or 
               3963.2 miles

   returns:    distance of the two points, in the same units as radius

   Reference: http://en.wikipedia.org/wiki/Great-circle_distance
*/
double haversine (double lat1, double lon1, double lat2, double lon2, double radius)
{
    double dlat, dlon, c1, c2, d1, d2, a, c, t;

    c1 = cospi (lat1 / 180.0);
    c2 = cospi (lat2 / 180.0);
    dlat = lat2 - lat1;
    dlon = lon2 - lon1;
    d1 = sinpi (dlat / 360.0);
    d2 = sinpi (dlon / 360.0);
    t = d2 * d2 * c1 * c2;
    a = d1 * d1 + t;
    c = 2.0 * asin (fmin (1.0, sqrt (a)));
    return radius * c;
}

For C++, the Boost library provides sin_pi and cos_pi, and some vendors offer sinpi and cospi functionality as extensions in system libraries. For example, Apple added __sinpi, __cospi and the corresponding single-precision versions __sinpif, __cospif to iOS 7 and OS X 10.9 (presentation, slide 101). But for many other platforms, there is no implementation readily accessible to C programs.

Compared with a traditional approach that uses e.g. sin (M_PI * x) and cos (M_PI * x), the use of sinpi and cospi improves accuracy by reducing rounding error via the internal multiplication with π, and also offers performance advantages due to the much simpler argument reduction.

How can one use the standard C math library to implement sinpi() and cospi() functionality in a reasonably efficient and standard compliant fashion?

Solution

For simplicity, I will focus on sincospi(), which simultaneously provides both the sine and the cosine results. sinpi and cospi can then be constructed as wrapper functions that discard unneeded data. In many applications, the handling of floating-point flags (see fenv.h) is not required, nor do we need errno error reporting most of the time, so I will omit these.

The basic algorithmic structure is straightforward. Because very large arguments are always even integers, πx is then a multiple of 2π, so the sine and cosine values are known exactly. Other arguments are folded into the range [-¼,+¼] while recording quadrant information. Polynomial minimax approximations are used to compute sine and cosine on the primary approximation interval. Finally, quadrant data is used to map the preliminary results to the final result by cyclical exchange of results and sign change.

The correct handling of special operands (in particular -0, infinities, and NaNs) requires the compiler to apply only optimizations that comply with IEEE-754 rules. It may not transform x*0.0 into 0.0 (this is not correct for -0, infinities, and NaNs) nor may it optimize 0.0-x into -x as negation is a bit-level operation according to section 5.5.1 of IEEE-754 (yielding different results for zeros and NaNs). Most compilers will offer a flag that enforces the use of "safe" transformations, e.g. -fp-model=precise for the Intel C/C++ compiler.
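The two transformations mentioned above can be checked with a small sketch. The helper names zero_like and zero_minus are hypothetical; volatile is used here only to discourage the compiler from folding the expressions at compile time.

```c
#include <math.h>

/* Returns a * 0.0 evaluated at run time.
   Result: +0 or -0 (sign of a) for finite a, NaN for infinities and NaNs. */
double zero_like (double a)
{
    volatile double v = a;
    return v * 0.0;
}

/* Returns 0.0 - x. For x = +0 this is +0 under round-to-nearest,
   whereas the negation -x would give -0. */
double zero_minus (double x)
{
    volatile double v = x;
    return 0.0 - v;
}
```

Under IEEE-754 semantics, zero_like (-3.0) is -0, zero_like (INFINITY) is NaN, and zero_minus (0.0) is +0. A build with unsafe floating-point optimizations enabled may give different answers, which is a quick way to check the compiler flags in use.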

One additional caveat applies to the use of the nearbyint function during argument reduction. Like rint, this function is specified to round according to the current rounding mode. When fenv.h isn't used, the rounding mode defaults to round "to-nearest-or-even". When it is used, there is a risk that a directed rounding mode is in effect. This could be worked around by the use of round, which always provides the rounding mode "round to nearest, ties away from zero" independent of current rounding mode. However, this function will tend to be slower since it is not supported by an equivalent machine instruction on most processor architectures.

A note on performance: The C99 code below relies heavily on the use of fma(), which implements a fused multiply-add operation. On most modern hardware architectures, this is directly supported by a corresponding hardware instruction. Where this is not the case, the code may experience significant slow-down due to generally slow FMA emulation.
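Whether fma() is fast on a given target can be checked at compile time with the standard macro FP_FAST_FMA from math.h. The sketch below uses it to select one Horner step; the helper name horner_step is hypothetical. Be aware that the fallback rounds twice, so the error bounds stated for the code below would not automatically carry over to it.

```c
#include <math.h>

/* One Horner step r*s + c. FP_FAST_FMA is defined by <math.h> when fma()
   generally executes about as fast as a separate multiply and add. */
double horner_step (double r, double s, double c)
{
#ifdef FP_FAST_FMA
    return fma (r, s, c);   /* single rounding */
#else
    return r * s + c;       /* two roundings unless the compiler contracts */
#endif
}
```

For example, horner_step (2.0, 3.0, 1.0) evaluates 2·3 + 1 and returns 7.0 on either path.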

#include <math.h>
#include <stdint.h>

/* Writes the sine result sin(πa) to the location pointed to by sp
   Writes the cosine result cos(πa) to the location pointed to by cp

   In extensive testing, no errors > 0.97 ulp were found in either the sine
   or cosine results, suggesting the results returned are faithfully rounded.
*/
void my_sincospi (double a, double *sp, double *cp)
{
    double c, r, s, t, az;
    int64_t i;

    az = a * 0.0; // must be evaluated with IEEE-754 semantics
    /* for |a| >= 2**53, cospi(a) = 1.0, but cospi(Inf) = NaN */
    a = (fabs (a) < 9.0071992547409920e+15) ? a : az;  // 0x1.0p53
    /* reduce argument to primary approximation interval [-0.25, 0.25] */
    r = nearbyint (a + a); // must use IEEE-754 "to nearest" rounding
    i = (int64_t)r;
    t = fma (-0.5, r, a);
    /* compute core approximations */
    s = t * t;
    /* Approximate cos(pi*x) for x in [-0.25,0.25] */
    r =            -1.0369917389758117e-4;
    r = fma (r, s,  1.9294935641298806e-3);
    r = fma (r, s, -2.5806887942825395e-2);
    r = fma (r, s,  2.3533063028328211e-1);
    r = fma (r, s, -1.3352627688538006e+0);
    r = fma (r, s,  4.0587121264167623e+0);
    r = fma (r, s, -4.9348022005446790e+0);
    c = fma (r, s,  1.0000000000000000e+0);
    /* Approximate sin(pi*x) for x in [-0.25,0.25] */
    r =             4.6151442520157035e-4;
    r = fma (r, s, -7.3700183130883555e-3);
    r = fma (r, s,  8.2145868949323936e-2);
    r = fma (r, s, -5.9926452893214921e-1);
    r = fma (r, s,  2.5501640398732688e+0);
    r = fma (r, s, -5.1677127800499516e+0);
    s = s * t;
    r = r * s;
    s = fma (t, 3.1415926535897931e+0, r);
    /* map results according to quadrant */
    if (i & 2) {
        s = 0.0 - s; // must be evaluated with IEEE-754 semantics
        c = 0.0 - c; // must be evaluated with IEEE-754 semantics
    }
    if (i & 1) { 
        t = 0.0 - s; // must be evaluated with IEEE-754 semantics
        s = c;
        c = t;
    }
    /* IEEE-754: sinPi(+n) is +0 and sinPi(-n) is -0 for positive integers n */
    if (a == floor (a)) s = az;
    *sp = s;
    *cp = c;
}

The single-precision version differs essentially only in the core approximations. Exhaustive testing allows the precise determination of error bounds.

#include <math.h>
#include <stdint.h>

/* Writes the sine result sin(πa) to the location pointed to by sp
   Writes the cosine result cos(πa) to the location pointed to by cp

   In exhaustive testing, the maximum error in sine results was 0.96677 ulp,
   the maximum error in cosine results was 0.96563 ulp, meaning results are
   faithfully rounded.
*/
void my_sincospif (float a, float *sp, float *cp)
{
    float az, t, c, r, s;
    int32_t i;

    az = a * 0.0f; // must be evaluated with IEEE-754 semantics
    /* for |a| >= 2**24, cospi(a) = 1.0f, but cospi(Inf) = NaN */
    a = (fabsf (a) < 0x1.0p24f) ? a : az;
    r = nearbyintf (a + a); // must use IEEE-754 "to nearest" rounding
    i = (int32_t)r;
    t = fmaf (-0.5f, r, a);
    /* compute core approximations */
    s = t * t;
    /* Approximate cos(pi*x) for x in [-0.25,0.25] */
    r =              0x1.d9e000p-3f;
    r = fmaf (r, s, -0x1.55c400p+0f);
    r = fmaf (r, s,  0x1.03c1cep+2f);
    r = fmaf (r, s, -0x1.3bd3ccp+2f);
    c = fmaf (r, s,  0x1.000000p+0f);
    /* Approximate sin(pi*x) for x in [-0.25,0.25] */
    r =             -0x1.310000p-1f;
    r = fmaf (r, s,  0x1.46737ep+1f);
    r = fmaf (r, s, -0x1.4abbfep+2f);
    r = (t * s) * r;
    s = fmaf (t, 0x1.921fb6p+1f, r);
    if (i & 2) {
        s = 0.0f - s; // must be evaluated with IEEE-754 semantics
        c = 0.0f - c; // must be evaluated with IEEE-754 semantics
    }
    if (i & 1) {
        t = 0.0f - s; // must be evaluated with IEEE-754 semantics
        s = c;
        c = t;
    }
    /* IEEE-754: sinPi(+n) is +0 and sinPi(-n) is -0 for positive integers n */
    if (a == floorf (a)) s = az;
    *sp = s;
    *cp = c;
}
