Approximation error when using sqrt and floor

Question

I need to enumerate the solutions of an equation, and I know that y < x * (sqrt(n) - 1), where x, y and n are integers.

My naive approach would be to look for y less than or equal to floor( x * ( sqrt( (float)n ) - 1 ) ).

  • Should I be worried about approximation error?

For example, if my expression is a little bit greater than an integer m, should I be worried about getting m-1 at the end?

How can I detect this kind of error?

Answer

You should definitely be worried about approximation error, but how worried depends upon the ranges of values for x and n you are concerned about.

Calculations in IEEE 4-byte floating-point representations are going to have errors roughly on the order of one part in 2^23 to 2^24; for an 8-byte representation (i.e., a double), it will be roughly one part in 2^52 to 2^53. You could expect then that you would need to use doubles rather than floats to get an accurate result for 32-bit integers x and n, and that even a double would be insufficient for 64-bit integers.
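One rough way to see where these limits bite is to test whether a given integer survives a round trip through the floating-point type. The helper below is a minimal sketch, not part of the original answer: the name fits_exactly is made up for illustration, and it assumes the value is below 2^63 so the conversion back to an integer cannot overflow.

```cpp
#include <cstdint>

// Hypothetical helper (not from the original answer): returns true when v
// is unchanged by a round trip through the floating-point type F.
// Assumes v < 2^63 so that converting back to an integer cannot overflow.
template <typename F>
bool fits_exactly(std::uint64_t v) {
    const F f = static_cast<F>(v);               // may round to a nearby value
    return static_cast<std::uint64_t>(f) == v;   // exact only if nothing was lost
}
```

With IEEE types, fits_exactly<float> starts failing just above 2^24 and fits_exactly<double> just above 2^53, which is why 64-bit inputs cannot be trusted to a double.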

As an example in code:

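#include <cmath>
#include <cstdint>

// Floating-point estimate of floor(x*(sqrt(n) - 1)), computed in type F.
template <typename F,typename V>
F approxub(V x,V n) {
    return std::floor(x*std::sqrt(F(n))-x);
}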

uint64_t n=1000000002000000000ull; // (10^9 + 1)^2 - 1
uint64_t x=3;
uint64_t y=approxub<double>(x,n);

This gives a value of y=3000000000, but the correct value is 2999999999.
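To detect this kind of error, one option is to verify a candidate y with exact integer arithmetic: for non-negative integers, y < x*(sqrt(n) - 1) is equivalent to (y + x)^2 < x^2 * n. The sketch below is an illustration rather than the original author's method. The name satisfies_bound is hypothetical, and it assumes a compiler that provides unsigned __int128 (a GCC/Clang extension) and inputs small enough that y + x < 2^64 and x*x*n < 2^128.

```cpp
#include <cstdint>

// unsigned __int128 is a GCC/Clang extension, assumed to be available here.
__extension__ typedef unsigned __int128 u128;

// Hypothetical checker: true when y < x*(sqrt(n) - 1) holds exactly.
// For non-negative integers this is the same as (y + x)^2 < x^2 * n.
// Assumes y + x < 2^64 and x*x*n < 2^128 so that nothing overflows.
bool satisfies_bound(std::uint64_t x, std::uint64_t n, std::uint64_t y) {
    const u128 t = static_cast<u128>(y) + x;      // t = y + x
    const u128 N = static_cast<u128>(x) * x * n;  // N = x^2 * n
    return t * t < N;
}
```

For x=3 and n=1000000002000000000, this check rejects y=3000000000 and accepts y=2999999999, so an estimate from floating point can be corrected by stepping down until the check passes.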

It's even worse when x is large and n is small, because large 64-bit integers are generally not exactly representable in IEEE doubles:

uint64_t n=9;
uint64_t x=5000000000000001111; // 5e18 + 1111
uint64_t y=approxub<double>(x,n);

The correct value for y is 2x = 10000000000000002222, i.e. 1e19 + 2222 (setting aside the fact that n is a perfect square here, so the strict inequality makes the true upper bound one less). The computed y, however, is 10000000000000004096.

Suppose you had a function isqrt which exactly computed the integer part of the square-root of an integer. Then you could say

y = isqrt(x*x*n) - x

and provided that the product x*x*n fits inside your integer type, you would have an exact upper bound (or one more than the upper bound if n is a perfect square). There's more than one way to write an isqrt function; this is an example implementation based on the material at Code Codex:

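#include <limits>
#include <type_traits>

// Integer part of the square root of v, computed one result bit at a time
// using only integer arithmetic. Negative inputs return 0.
template <typename V>
V isqrt(V v) {
    if (v<0) return 0;

    typedef typename std::make_unsigned<V>::type U;
    U u=v,r=0;

    // Start from the largest power of 4 representable in U.
    constexpr int ubits=std::numeric_limits<U>::digits;
    U place=U(1)<<(2*((ubits-1)/2));

    // Skip powers of 4 that are larger than the input.
    while (place>u) place/=4;
    while (place) {
        if (u>=r+place) {
            u-=r+place;
            r+=2*place;
        }
        r/=2;
        place/=4;
    }
    return (V)r;
}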
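The perfect-square case can be folded into the formula: the largest integer t with t*t < N is the integer square root of N - 1, so taking the square root of x*x*n - 1 gives the strict bound in every case. The sketch below illustrates this under stated assumptions (x >= 1, n >= 2, and x*x*n fits in 64 bits). The names isqrt_u64 and exact_upper_bound are hypothetical, and the binary-search square root is only a compact stand-in for any exact isqrt.

```cpp
#include <cstdint>

// Stand-in integer square root by binary search; any exact isqrt would do.
std::uint64_t isqrt_u64(std::uint64_t v) {
    std::uint64_t lo = 0;
    std::uint64_t hi = 4294967296ull;  // 2^32, whose square exceeds any 64-bit v
    while (hi - lo > 1) {
        const std::uint64_t mid = lo + (hi - lo) / 2;  // mid < 2^32, so mid*mid fits
        if (mid * mid <= v) lo = mid; else hi = mid;
    }
    return lo;
}

// Hypothetical wrapper: the largest y with y < x*(sqrt(n) - 1), exactly.
// Assumes x >= 1, n >= 2, and that x*x*n fits in 64 bits.
std::uint64_t exact_upper_bound(std::uint64_t x, std::uint64_t n) {
    const std::uint64_t N = x * x * n;
    // The largest t with t*t < N is isqrt(N - 1); then y = t - x.
    return isqrt_u64(N - 1) - x;
}
```

For x=3 and n=1000000002000000000 this returns 2999999999, and for a perfect square such as x=5, n=9 it returns 9 rather than 10.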

What if x is too large for this, for example if our largest integral type has 64 bits and x is larger than 2^32? The most straightforward solution would be to do a binary search, taking as bounds x r - x and x r, where r = [√n] is the integer square root of n.
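A binary search along these lines might look like the sketch below. It is an illustration under assumptions, not a definitive implementation: it searches over t = y + x instead of y, uses unsigned __int128 (a GCC/Clang extension) for the exact comparison t*t < x*x*n, and assumes x >= 1, n >= 1 and x*x*n < 2^128. The names isqrt64 and max_y_wide are hypothetical. Without a 128-bit type, the comparison would need hand-written multi-word arithmetic.

```cpp
#include <cstdint>

// unsigned __int128 is a GCC/Clang extension, assumed to be available here.
__extension__ typedef unsigned __int128 u128;

// Integer square root of a 64-bit value by binary search.
static std::uint64_t isqrt64(std::uint64_t v) {
    std::uint64_t lo = 0;
    std::uint64_t hi = 4294967296ull;  // 2^32, whose square exceeds any 64-bit v
    while (hi - lo > 1) {
        const std::uint64_t mid = lo + (hi - lo) / 2;  // mid < 2^32, so mid*mid fits
        if (mid * mid <= v) lo = mid; else hi = mid;
    }
    return lo;
}

// Hypothetical helper: stores in y the largest integer with
// y < x*(sqrt(n) - 1) and returns true, or returns false when no y >= 0 exists.
// Assumes x >= 1, n >= 1 and x*x*n < 2^128.
bool max_y_wide(std::uint64_t x, std::uint64_t n, std::uint64_t& y) {
    const u128 N = static_cast<u128>(x) * x * n;
    const u128 r = isqrt64(n);
    // Search over t = y + x for the largest t with t*t < N.
    u128 lo = static_cast<u128>(x) * r - 1;  // lo*lo < N because x*r <= x*sqrt(n)
    u128 hi = static_cast<u128>(x) * r + x;  // hi*hi > N because (r+1)^2 > n
    while (hi - lo > 1) {
        const u128 mid = lo + (hi - lo) / 2;
        // If mid >= 2^64 then mid*mid >= 2^128 > N, so skip the multiplication.
        const bool below = (mid >> 64) == 0 && mid * mid < N;
        if (below) lo = mid; else hi = mid;
    }
    if (lo < x) return false;  // only negative y would satisfy the inequality
    y = static_cast<std::uint64_t>(lo - x);
    return true;
}
```

For x=5000000000000001111 and n=9 this yields y=10000000000000002221, one less than 2x because n is a perfect square and the inequality is strict.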
