Reliable overflow detection of floating-point/integer type conversion

Question

Is there a safe way to reliably determine if an integral type T can store a floating-point integer value f (so f == floor(f)) without any overflow?

Keep in mind that there is no guarantee that the floating-point type F is IEC 559 (IEEE 754) compatible, and that signed integer overflow is undefined behavior in C++. I'm interested in a solution that is correct according to the current C++ standard (C++17 at the time of writing) and avoids undefined behavior.

The following naive approach is not reliable, since there is no guarantee that type F can represent std::numeric_limits<I>::max() due to floating-point rounding.

#include <cmath>
#include <limits>
#include <type_traits>

template <typename I, typename F>
bool is_safe_conversion(F x)
{
    static_assert(std::is_floating_point_v<F>);
    static_assert(std::is_integral_v<I>);

    // 'fmax' may have a different value than expected
    static constexpr F fmax = static_cast<F>(std::numeric_limits<I>::max());

    return std::abs(x) <= fmax; // this test may give incorrect results
}
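To see the failure concretely, here is a small sketch of the naive check instantiated for std::int32_t and float. It assumes float is IEEE 754 binary32 with the usual round-to-nearest conversion, which the question explicitly does not guarantee. The name naive_is_safe is made up for this illustration.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Naive range check for float -> std::int32_t (hypothetical helper).
inline bool naive_is_safe(float x)
{
    // INT32_MAX (2147483647) needs 31 significant bits; binary32 has 24,
    // so the conversion rounds up to 2147483648.0f (assuming IEEE 754).
    const float fmax = static_cast<float>(std::numeric_limits<std::int32_t>::max());
    return std::abs(x) <= fmax;
}
```

Under those assumptions, naive_is_safe(2147483648.0f) returns true even though 2147483648 does not fit in std::int32_t, so a subsequent static_cast would be undefined behavior.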

Any ideas?

Recommended answer


Is there a safe way to reliably determine if an integral type T can store a floating-point integer value f?

Yes. The key is to test whether f lies strictly between T::MIN - 1 and T::MAX + 1 (that is, in the range T::MIN - 0.999... to T::MAX + 0.999...) using floating-point math, with no rounding issues. Bonus: the rounding mode does not matter.

There are 3 failure paths: too big, too small, not-a-number.


The code below assumes int/double. I'll leave forming the C++ template to the OP.

Forming T::MAX + 1 exactly using floating-point math is easy because INT_MAX is a Mersenne number, i.e. a power of 2 minus 1. (We are not talking about Mersenne primes here.)

The code takes advantage of two facts:
A Mersenne number divided by 2 with integer math is also a Mersenne number.
The conversion of an integer power-of-2 constant to a floating-point type can be relied on to be exact.

#include <limits.h>

// Exactly INT_MAX + 1: (INT_MAX/2 + 1) is a power of 2
#define DBL_INT_MAXP1 (2.0*(INT_MAX/2+1))
// Exactly INT_MIN - 1; needed only when -INT_MAX == INT_MIN
#define DBL_INT_MINM1 (2.0*(INT_MIN/2-1))
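The two facts above can be checked at compile time. The following is only an illustrative sketch: it assumes long long is wider than int, so that INT_MAX + 1 can be formed in integer math, and the names half_max_plus_1 and dbl_int_maxp1 are invented for the example.

```cpp
#include <climits>

// INT_MAX / 2 + 1 is a power of two, so it converts to double exactly.
constexpr int half_max_plus_1 = INT_MAX / 2 + 1;
static_assert((half_max_plus_1 & (half_max_plus_1 - 1)) == 0,
              "INT_MAX / 2 + 1 should be a power of two");

// Doubling a power of two gives another power of two, so this product
// is exactly INT_MAX + 1 with no rounding.
constexpr double dbl_int_maxp1 = 2.0 * half_max_plus_1;

// Cross-check against integer math (assumes long long is wider than int).
static_assert(static_cast<long long>(dbl_int_maxp1) == INT_MAX + 1LL,
              "dbl_int_maxp1 should equal INT_MAX + 1");
```

If either static_assert fired, the platform would violate one of the assumptions the macros rely on.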

Forming T::MIN - 1 exactly is hard because its absolute value is usually a power of 2 plus 1, and the relative precision of the integer type and the FP type is not certain. Instead, the code can subtract the exact power of 2 and compare the result to -1.

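// Handle_Underflow(), Handle_Overflow() and Handle_NaN() are left for the
// caller to define. This sketch assumes they do not return normally
// (e.g. they throw or abort), since no value is returned after them.
int double_to_int(double x) {
  if (x < DBL_INT_MAXP1) {
    #if -INT_MAX == INT_MIN
    // rare non-2's complement machine
    if (x > DBL_INT_MINM1) {
      return (int) x;
    }
    #else
    // INT_MIN is a negated power of 2, so it converts to double exactly
    if (x - INT_MIN > -1.0) {
      return (int) x;
    }
    #endif
    Handle_Underflow();
  } else if (x > 0) {
    Handle_Overflow();
  } else {
    // every comparison with NaN is false, so NaN ends up here
    Handle_NaN();
  }
}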
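The answer leaves the C++ template to the OP. One possible shape is sketched below. It is an illustration, not the answer author's code. The name checked_trunc and the std::optional return type are choices made here, and the static_asserts restrict it to binary floating-point types whose exponent range covers the integer type.

```cpp
#include <limits>
#include <optional>
#include <type_traits>

// Hypothetical helper: returns x truncated toward zero as an I, or
// std::nullopt if x is NaN or the truncated value does not fit in I.
template <typename I, typename F>
std::optional<I> checked_trunc(F x)
{
    static_assert(std::is_floating_point_v<F>);
    static_assert(std::is_integral_v<I> && !std::is_same_v<I, bool>);
    // Assumptions of this sketch: binary radix and enough exponent range,
    // so the powers of two used below are exactly representable in F.
    static_assert(std::numeric_limits<F>::radix == 2);
    static_assert(std::numeric_limits<I>::digits < std::numeric_limits<F>::max_exponent);

    constexpr auto imax = std::numeric_limits<I>::max();
    constexpr auto imin = std::numeric_limits<I>::min();

    // imax / 2 + 1 is a power of two, so max_plus_1 is exactly imax + 1.
    constexpr F max_plus_1 = F(2) * static_cast<F>(imax / 2 + 1);
    if (!(x < max_plus_1)) {
        return std::nullopt;  // too big, +infinity, or NaN
    }

    if constexpr (imin + imax == 0) {
        // Signed type with imin == -imax (not two's complement).
        constexpr F min_minus_1 = F(2) * static_cast<F>(imin / 2 - 1);
        if (!(x > min_minus_1)) {
            return std::nullopt;  // too small
        }
    } else {
        // imin is 0 or a negated power of two, so its conversion is exact.
        if (!(x - static_cast<F>(imin) > F(-1))) {
            return std::nullopt;  // too small or -infinity
        }
    }
    return static_cast<I>(x);  // the truncated value is now known to fit
}
```

For example, checked_trunc<int>(x) with a double x yields an empty optional for NaN, infinities, and any x at or beyond INT_MAX + 1 or INT_MIN - 1. Returning std::optional instead of calling handlers is a design choice of this sketch. It loses the distinction between the three failure paths.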






Regarding floating-point types with non-binary radix (FLT_RADIX != 2)

With FLT_RADIX = 4, 8, 16, ..., the conversion is exact too. With FLT_RADIX == 10, the code is exact at least up to a 34-bit int, since a double must encode +/-10^10 exactly. So a problem would arise only with, say, a FLT_RADIX == 10 machine that has a 64-bit int, which is a low risk. From memory, the last FLT_RADIX == 10 machine in production was over a decade ago.
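Whether a given pair of types falls into the safe case can be tested at compile time. The sketch below is a conservative illustration with invented names: it accepts only radices that are powers of two and rejects everything else, including radix 10, where exactness also depends on the precision.

```cpp
#include <limits>

// Hypothetical helpers for this sketch.
constexpr bool is_power_of_two(int r) { return r > 0 && (r & (r - 1)) == 0; }
constexpr int log2_floor(int r) { int k = 0; while (r > 1) { r /= 2; ++k; } return k; }

// True if 2^digits(I), i.e. max(I) + 1, is certain to be exactly
// representable in F. Conservatively false for radices such as 10.
template <typename I, typename F>
constexpr bool power_of_two_bounds_exact()
{
    using FL = std::numeric_limits<F>;
    if (!is_power_of_two(FL::radix)) {
        return false;
    }
    // The largest representable power of two is
    // 2^(log2(radix) * max_exponent - 1).
    return std::numeric_limits<I>::digits < log2_floor(FL::radix) * FL::max_exponent;
}
```

A static_assert(power_of_two_bounds_exact<int, double>()) placed next to the conversion code would then turn an unsupported platform into a compile error rather than a silent rounding problem.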

The integer type is always encoded as 2's complement (most common), 1s' complement, or sign-magnitude. INT_MAX is always a power of 2 minus 1. INT_MIN is always a negated power of 2, or that value plus 1. Effectively, the encoding is always base 2.
