计算一个正确舍入/几乎是正确舍入的浮点立方根 [英] Computing a correctly rounded / an almost correctly rounded floating-point cubic root

查看：182 发布时间：2015/11/30 14:35:56 algorithm floating-point ieee-754

本文介绍了计算一个正确舍入/几乎是正确舍入的浮点立方根的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

假设正确舍如发现 CRlibm 可用标准库函数。那么如何将一个计算双precision输入正确舍立方根？</ P>

^{这个问题不是引用的一个常见问题解答的[I]面对的实际问题。它像功课有点这样。但立方根是一个经常发现操作，人们可以想像这个问题是一个实际的问题，有人的面孔。}

由于最好的堆栈溢出的问题，有一点源$ C $ c。在他们，这里是一个有点源$ C $ C：

  Y =战俘（X，1/3）;

以上不计算一个正确舍立方根，因为1/3不重presentable正是因为一个双。

其他注意事项：

这是文章介绍如何计算浮点立方根，但最后一次迭代（ S中的推荐的牛顿 - 拉夫逊算法）将必须做更高precision的算法来计算正确的圆形双precision立方根。这可能是计算它的最好办法，但我现在还在找工作，将利用现有正确舍入的规范功能优势的捷径。

C99包括 CBRT（）的功能，但它不能被预期将正确舍入的甚至忠实为所有的编译器。 CRlibm的设计师也可以选择将包括在所提供的功能列表 CBRT（），但他们没有这样做。引用的正确舍入的数学函数库等可实现的欢迎。

解决方案

我很害怕，我不知道该如何保证正确的舍入双precision立方根，但可以提供一个非常接近正确四舍五入也提到了这个问题。换句话说，出现最大误差是非常接近0.5 ULP

彼得·马克斯坦，IA-64和基本功能：速度和precision（prentice霍尔2000）的

presents高效FMA为基础的技术的正确四舍五入倒数，平方根，和平方根倒数，但它不包括在这方面的立方根。一般来说马克斯坦的方法需要preliminary结果是在1 ULP 之前准确最后四舍五入序列。我没有数学资金来他的技术扩展到立方根的舍入，但在我看来，这应该是可能的原则，是一个挑战，有点类似于倒数平方根。

逐位算法容易借给自己的根与正确的四舍五入计算。由于领带例为IEEE-754舍入模式最接近，或者，甚至不会发生，人们只是需要进行计算，直到它已生产的所有尾数位加一圆位。逐位算法平方根，基于二项式定理，是公知的在两个非还原和还原的变体和一直是硬件实现的基础。通过二项式定理同样的方法适用于立方根，有一个鲜为人知的纸勾画出的非恢复执行的细节：

小时。鹏算法提取平方根和立方根，程序上的计算机算法，第121-126，1981年第五届IEEE国际研讨会。的

最好的我可以告诉它尝试这个作品不够好，立方根从整数提取。因为每次迭代仅产生单个结果数位是不完全快。用于浮点运算的应用程序具有用一对夫妇，需要大致的最终结果的比特数的两倍簿记变量的缺点。这意味着人们会需要使用128位的整数运算来实现双precision立方根。

我的C99 code以下是基于哈雷的理性方法立方根的具有立方收敛意味着初始近似值不必很准确的有效数字的数目在每一次迭代的三倍。计算可安排以不同的方式。一般来说，在数字上有利的安排迭代格式为

new_guess：= old_guess +校

因为对于一个充分接近最初的猜测，修正比 old_guess 显著较小。这导致的立方根以下迭代计划：

X：= X - X *（X ³ - 一个）/（2 * X ³ + y）的

这特别的安排也被列在 Kahan的关于立方根笔记。它有自然的贷款本身使用 FMA（融合的乘加）。一个缺点是，2 * x的计算³可能导致溢出，因此一个参数缩减方案所需的至少部分双链precision输入域。在我的code我简单地套用参数减少为所有非特殊的投入的基础上，简单明了的操控IEEE-754双precision操作数的代表人物。

的间隔[0.125,1）作为主要的近似时间间隔。多项式极小极大近似时，返回的[0.5,1]的初始猜测。在狭窄的范围内方便使用的单precision运算的计算的低精度的部分。

我无法证明对我的执行误差界任何事情，然而，随着对一个参考实现200多万元的随机测试向量（精确到约200位）测试发现超过一半的ulp没有错误，这表明最大误差必须非常接近0.5 ULP。

双my_cbrt（双一）
{
    双B，U，V，R;
    浮动BB，UU，VV;
    INT E，F，S;

    如果（（A == 0.0）|| isinf（一）|| isnan（一））{
        / *处理特殊情况* /
        R = A + A;
    } 其他 {
        / *去掉符号位* /
        B =晶圆厂（一）;
        / *计算指数调整* /
        B = frexp（B，急症室）;
        S =ë -  3 * 342;
        F = S / 3;
        S =秒 -  3 * F;
        F = F + 342;
        / *映射参数到主逼近区间[0.125,1）* /
        B = ldexp（B，S）;
        BB =（浮点）B：
        / *在[0.125,1近似立方体根）与相对误差5.22e-3 * /
        UU = 0x1.2f32c0p-1F;
        UU = fmaf（UU，BB，-0x1.62cc2ap + 0F）;
        UU = fmaf（UU，BB，0x1.7546e0p + 0F）;
        UU = fmaf（UU，BB，0x1.5d0590p-2F）;
        / *使用两个哈雷迭代W¯¯细化立方根/立方收敛* /
        VV = UU * UU;
        UU = fmaf（fmaf（VV，UU，-bb）/ fmaf（VV，2.0f * UU，BB），-uu，UU）;
        U =（双）UU;
        V = U * U; //这个产品是精确的
        R = FMA（FMA（V，U，-b）/ FMA（V，2.0 * U，B），-u，U）;
        / *从初级逼近区间映射回干扰指数* /
        R = ldexp（R，F）;
        / *恢复符号位* /
        R =复制符号（R，A）;
    }
    返回ř;
}

Suppose that correctly rounded standard library functions such as found in CRlibm are available. Then how would one compute the correctly rounded cubic root of a double-precision input?

^{This question is not an "actual problem that [I] face", to quote the FAQ. It is a little bit like homework this way. But the cubic root is a frequently found operation and one could imagine this question being an actual problem that someone faces.}

Since "best Stack Overflow questions have a bit of source code in them", here is a bit of source code:

  y = pow(x, 1. / 3.);

The above does not compute a correctly rounded cubic root because 1/3 is not representable exactly as a double.

ADDITIONAL NOTES:

An article describes how to compute a floating-point cubic root, but the last iteration(s) of the recommended Newton-Raphson algorithm would have to be done to higher precision for the algorithm to compute a correctly rounded double-precision cubic root. That may be the best way to compute it, but I am still looking for a shortcut that would take advantage of existing correctly rounded standardized functions.

C99 includes a cbrt() function, but it cannot be expected to be correctly rounded or even faithful for all compilers. CRlibm's designers could have chosen to include cbrt() in the list of provided functions, but they didn't. References to implementations available in other libraries of correctly rounded math functions are welcome.

解决方案

I am afraid I do not know how to guarantee a correctly rounded double-precision cube root, but can offer one that is very nearly correctly rounded as also mentioned in the question. In other words, the maximum error appears to be very close to 0.5 ulp.

Peter Markstein, "IA-64 and Elementary Functions: Speed and Precision" (Prentice-Hall 2000)

presents efficient FMA-based techniques for correctly rounding reciprocal, square root, and reciprocal square root, but it does not cover the cube root in this regard. In general Markstein's approach requires a preliminary result that is accurate to within 1 ulp prior to the final rounding sequence. I do not have the mathematical wherewithal to extend his technique to the rounding of cube roots, but it seems to me that this should be possible in principle, being a challenge somewhat similar to the reciprocal square root.

Bit-wise algorithms lend themselves easily to the computation of roots with correct rounding. Since tie cases for the IEEE-754 rounding mode to-nearest-or-even cannot occur, one simply needs to carry the computation until it has produced all mantissa bits plus one round bit. The bit-wise algorithm for square root, based on the binomial theorem, is well known in both non-restoring and restoring variants and has been the basis of hardware implementations. The same approach via binomial theorem works for the cube root, and there is a little-known paper that lays out the details of a non-restoring implementation:

H. Peng, "Algorithms for Extracting Square Roots and Cube Roots," Proceedings 5th IEEE International Symposium on Computer Arithmetic, pp. 121-126, 1981.

Best I can tell from experimenting with it this works well enough for the extraction of cube roots from integers. Since each iteration produces only a single result bit it is not exactly fast. For applications in floating-point arithmetic it has the drawback of using a couple of bookkeeping variables that require roughly twice the number of bits of the final result. This means one would need to use 128-bit integer arithmetic to implement a double-precision cube root.

My C99 code below is based on Halley's rational method for the cube root which has cubic convergence meaning that the initial approximation does not have to be very accurate as the number of valid digits triples in every iteration. The computation can be arranged in various ways. Generally it is numerically advantageous to arrange iterative schemes as

new_guess := old_guess + correction

since for a sufficiently close initial guess, correction is significantly smaller than old_guess. This leads to the following iteration scheme for the cube root:

x := x - x * (x³ - a) / (2*x³ + y)

This particular arrangement is also listed in Kahan's notes on cube root. It has the further advantage of lending itself naturally to the use of FMA (fused-multiply add). One drawback is that the computation of 2*x³ could lead to overflow, therefore an argument reduction scheme is required for at least part of the double-precision input domain. In my code I simply apply the argument reduction to all non-exceptional inputs, based on straightforward manipulation of the exponents of IEEE-754 double-precision operands.

The interval [0.125,1) serves as the primary approximation interval. A polynomial minimax approximation is used that returns an initial guess in [0.5,1]. The narrow range facilitates the use of single-precision arithmetic for the low-accuracy portions of the computation.

I cannot prove anything about the error bounds of my implementation, however, testing with more than 200 million random test vectors against a reference implementation (accurate to about 200 bits) detected no errors in excess of half an ulp, suggesting that the maximum error must be very close to 0.5 ulp.

double my_cbrt (double a)
{
    double b, u, v, r;
    float bb, uu, vv;
    int e, f, s;

    if ((a == 0.0) || isinf(a) || isnan(a)) {
        /* handle special cases */
        r = a + a;
    } else {
        /* strip off sign-bit */
        b = fabs (a);
        /* compute exponent adjustments */
        b = frexp (b, &e);
        s = e - 3*342;
        f = s / 3;
        s = s - 3 * f;
        f = f + 342;
        /* map argument into the primary approximation interval [0.125,1) */
        b = ldexp (b, s);
        bb = (float)b;
        /* approximate cube root in [0.125,1) with relative error 5.22e-3 */
        uu =                0x1.2f32c0p-1f;
        uu = fmaf (uu, bb, -0x1.62cc2ap+0f);
        uu = fmaf (uu, bb,  0x1.7546e0p+0f);
        uu = fmaf (uu, bb,  0x1.5d0590p-2f);
        /* refine cube root using two Halley iterations w/ cubic convergence */
        vv = uu * uu;
        uu = fmaf (fmaf (vv, uu, -bb) / fmaf (vv, 2.0f*uu, bb), -uu, uu);
        u = (double)uu;
        v = u * u; // this product is exact
        r = fma (fma (v, u, -b) / fma (v, 2.0*u, b), -u, u);
        /* map back from primary approximation interval by jamming exponent */
        r = ldexp (r, f);
        /* restore sign bit */
        r = copysign (r, a);
    }
    return r;
}

这篇关于计算一个正确舍入/几乎是正确舍入的浮点立方根的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

计算一个正确舍入/几乎是正确舍入的浮点立方根 [英] Computing a correctly rounded / an almost correctly rounded floating-point cubic root

问题描述

相关文章

C/C++最新文章

热门教程

热门工具

登录关闭

计算一个正确舍入/几乎是正确舍入的浮点立方根 [英] Computing a correctly rounded / an almost correctly rounded floating-point cubic root

问题描述

相关文章

C/C++最新文章

热门教程

热门工具

登录 关闭

登录关闭