Accurate computation of PDF of standard normal distribution using C standard math library
Question
The probability density function of the standard normal distribution is defined as e^(−x²/2) / √(2π). This can be rendered in a straightforward manner into C code. A sample single-precision implementation might be:
float my_normpdff (float a)
{
return 0x1.988454p-2f * my_expf (-0.5f * a * a); /* 1/sqrt(2*pi) */
}
While this code is free from premature underflow, there is an accuracy issue since the error incurred in the computation of a²/2 is magnified by the subsequent exponentiation. One can easily demonstrate this with tests against higher-precision references. The exact error will differ based on the accuracy of the exp() or expf() implementation used; for faithfully rounded exponentiation functions one would typically observe a maximum error of around 2⁶ ulps for IEEE-754 binary32 single precision, and around 2⁹ ulps for IEEE-754 binary64 double precision.
How can the accuracy issue be addressed in a reasonably efficient manner? A trivial approach would be to employ higher-precision intermediate computation, for example using double computation for the float implementation. But this approach does not work for a double implementation if floating-point arithmetic of higher precision is not easily available, and it may be inefficient for a float implementation if double arithmetic is significantly more expensive than float computation, e.g. on many GPUs.
Solution

The accuracy issue raised in the question can be addressed effectively, and efficiently, by the use of limited amounts of double-float or double-double computation, facilitated by the fused multiply-add (FMA) operation.
This operation is available since C99 via the standard math functions fmaf(a,b,c) and fma(a,b,c), which compute a*b+c without rounding of the intermediate product. While these functions map directly to fast hardware operations on almost all modern processors, they may use emulation code on older platforms, in which case they may be very slow.
This allows the computation of the product with twice the normal precision using just two operations, resulting in a head:tail pair of native-precision numbers:
prod_hi = a * b // head
prod_lo = FMA (a, b, -prod_hi) // tail
The high-order bits of the result can be passed to the exponentiation, while the low-order bits are used for improving the accuracy of the result via linear interpolation, taking advantage of the fact that eˣ is its own derivative:
e = exp (prod_hi) + exp (prod_hi) * prod_lo // exp (a*b)
This allows us to eliminate most of the error of the naive implementation. The other, minor, source of computation error is the limited precision with which the constant 1/√(2π) is represented. This can be improved by using a head:tail representation for the constant that provides twice the native precision, and computing:
r = FMA (const_hi, x, const_lo * x) // const * x
The following paper points out that this technique can even result in correctly-rounded multiplication for some arbitrary-precision constants:
Nicolas Brisebarre and Jean-Michel Muller, "Correctly rounded multiplication by arbitrary precision constants", IEEE Transactions on Computers, Vol. 57, No. 2, February 2008, pp. 165-174
Combining the two techniques, and taking care of a few corner cases involving NaNs, we arrive at the following float implementation based on IEEE-754 binary32:
float my_normpdff (float a)
{
const float RCP_SQRT_2PI_HI = 0x1.988454p-02f; /* 1/sqrt(2*pi), msbs */
const float RCP_SQRT_2PI_LO = -0x1.857936p-27f; /* 1/sqrt(2*pi), lsbs */
float ah, sh, sl, ea;
ah = -0.5f * a;
sh = a * ah;
sl = fmaf (a, ah, 0.5f * a * a); /* don't flip "sign bit" of NaN argument */
ea = expf (sh);
if (ea != 0.0f) ea = fmaf (sl, ea, ea); /* avoid creation of NaN */
return fmaf (RCP_SQRT_2PI_HI, ea, RCP_SQRT_2PI_LO * ea);
}
The corresponding double implementation, based on IEEE-754 binary64, looks almost identical, except for the different constant values used:
double my_normpdf (double a)
{
const double RCP_SQRT_2PI_HI = 0x1.9884533d436510p-02; /* 1/sqrt(2*pi), msbs */
const double RCP_SQRT_2PI_LO = -0x1.cbc0d30ebfd150p-56; /* 1/sqrt(2*pi), lsbs */
double ah, sh, sl, ea;
ah = -0.5 * a;
sh = a * ah;
sl = fma (a, ah, 0.5 * a * a); /* don't flip "sign bit" of NaN argument */
ea = exp (sh);
if (ea != 0.0) ea = fma (sl, ea, ea); /* avoid creation of NaN */
return fma (RCP_SQRT_2PI_HI, ea, RCP_SQRT_2PI_LO * ea);
}
The accuracy of these implementations depends on the accuracy of the standard math functions expf() and exp(), respectively. Where the C math library provides faithfully-rounded versions of those, the maximum error of either of the two implementations above is typically less than 2.5 ulps.