Loss of precision for int to float conversion

Views: 267  Posted: 2021/5/8 19:58:50  Tags: c++ floating-point type-conversion


Question

In C++, converting an integer value i of type I to a floating-point type F is exact, in the sense that static_cast<I>(static_cast<F>(i)) == i, if the range of I is contained in the range of integral values that F can represent.

Is it possible, and if so how, to calculate the loss of precision of static_cast<F>(i) (without using another floating-point type with a wider range)?

As a start, I tried to write a function that returns whether a conversion is safe (safe meaning no loss of precision), but I must admit I am not sure it is correct.

template <class F, class I>
bool is_cast_safe(I value)
{
    return std::abs(value) < std::numeric_limits<F>::digits;
}

std::cout << is_cast_safe<float>(4) << std::endl; // true
std::cout << is_cast_safe<float>(0x1000001) << std::endl; // false

Thanks.

Recommended answer

is_cast_safe can be implemented with:

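// Assumes value != 0: zero converts exactly, and std::ilogb(0) is not usable here.
static const F One = 1;
// ULP of F at the magnitude of value: radix^(top digit position - (digits - 1)).
F ULP = std::scalbn(One, std::ilogb(value) - std::numeric_limits<F>::digits + 1);
// A ULP below 1 means every integer of this magnitude is representable.
I U = std::max(ULP, One);
return value % U == 0;  // exact if and only if value is a multiple of the ULP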

This sets ULP to the value of the lowest digit position in the result of converting value to F. ilogb returns the position of the highest digit (as an exponent of the floating-point radix), and subtracting one less than the number of digits moves down to the lowest digit position. scalbn then gives the value of that position, which is the ULP.

value can then be represented exactly in F if and only if it is a multiple of the ULP. To test that, we convert the ULP to I (substituting 1 if it is less than 1) and take the remainder of value divided by the ULP (or 1). The conversion is exact when that remainder is zero.
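To make the fragment above concrete, here is a minimal sketch of a complete function. The zero check, the static_assert guards and the debug assert are additions for illustration and are not part of the original answer. The sketch also assumes that the conversion to F does not overflow, and that std::ilogb applied to an integer (which converts the argument to double) reports an exponent good enough for this test.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <type_traits>

// Returns true if converting `value` to floating-point type F loses no precision.
template <class F, class I>
bool is_cast_safe(I value)
{
    static_assert(std::is_integral<I>::value, "I must be an integer type");
    static_assert(std::is_floating_point<F>::value, "F must be a floating-point type");

    // Zero is always exact, and std::ilogb(0) does not return a usable exponent.
    if (value == 0) return true;

    const F One = 1;
    // Weight of the lowest digit that F keeps for a number of this magnitude.
    const F ULP = std::scalbn(One, std::ilogb(value) - std::numeric_limits<F>::digits + 1);
    // A ULP below 1 means every integer of this magnitude is representable.
    const I U = static_cast<I>(std::max(ULP, One));
    assert(U > 0);  // the divisor is at least 1 by construction
    return value % U == 0;
}
```

For a float with 24 significand digits, 0x1000001 has its highest digit at position 24, so the ULP is 2^(24 - 24 + 1) = 2. The value is odd, so the function reports that the conversion is not exact.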

Also, if one is concerned that the conversion to F might overflow, code can be inserted to handle that as well.
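One possible way to handle the overflow concern is a compile-time check on the two types. The sketch below is only a sufficient condition and covers only radix-2 floating-point types. The function name conversion_cannot_overflow is hypothetical and does not come from the original answer.

```cpp
#include <limits>

// Sufficient (not necessary) condition: a result of true means that no value of
// integer type I can overflow when converted to floating-point type F.
template <class F, class I>
constexpr bool conversion_cannot_overflow()
{
    // I holds magnitudes up to 2^digits (reached only by the most negative
    // signed value). With radix 2, F represents 2^(max_exponent - 1) as a finite
    // value, so the comparison below keeps every magnitude of I finite in F.
    return std::numeric_limits<F>::radix == 2
        && std::numeric_limits<I>::digits < std::numeric_limits<F>::max_exponent;
}
```

When the function returns false, the answer is "unknown" rather than "will overflow", and a run-time range test would still be needed.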

Calculating the actual amount of the change is trickier. The conversion to floating-point may round up or down, and the rule for choosing is implementation-defined, although round-to-nearest with ties to even is common. So the actual change cannot be calculated from the floating-point properties given in numeric_limits alone. It must involve performing the conversion and doing some work in floating-point. This can definitely be done, but it is a nuisance. I think an approach that should work is:

1. Assume value is non-negative. (Negative values can be handled similarly but are omitted here for simplicity.)
2. First, test for overflow in the conversion to F. This is tricky in itself, because the behavior is undefined if the value is too large. Similar considerations are addressed in an answer to a question about safely converting from floating-point to integer (in C).
3. If the value does not overflow, convert it. Let the result be x. Divide x by the floating-point radix r, producing y. If y is not an integer (which can be tested with fmod or trunc), the conversion was exact.
4. Otherwise, convert y to I, producing z. This is safe because y does not exceed the original value, so it must fit in I.
5. The error due to the conversion, x - value, is then (z - value/r)*r - value%r. This follows from x = z*r and value = (value/r)*r + value%r.
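The steps above can be sketched as follows. This is an illustration under stated assumptions, not the answer author's code. The name conversion_error is hypothetical, I is restricted to signed integer types so that a negative error can be returned, and the caller is assumed to have ruled out overflow already. The sketch subtracts value % r so that the result equals the converted value minus the original value.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <type_traits>

// Returns static_cast<F>(value) - value, computed without a wider type.
// Assumed preconditions: value >= 0 and the conversion to F does not overflow.
template <class F, class I>
I conversion_error(I value)
{
    static_assert(std::is_integral<I>::value && std::is_signed<I>::value,
                  "this sketch returns a signed error, so I must be a signed integer type");
    assert(value >= 0);

    const I r = static_cast<I>(std::numeric_limits<F>::radix);
    const F x = static_cast<F>(value);  // the possibly rounded result
    const F y = x / static_cast<F>(r);  // dividing by the radix is exact here

    // If y is not an integer, x is not a multiple of r, so the conversion was exact.
    if (std::trunc(y) != y) return 0;

    // y does not exceed value, so it fits in I even when x itself would not.
    const I z = static_cast<I>(y);

    // x = z * r and value = (value / r) * r + value % r, hence x - value is:
    return static_cast<I>((z - value / r) * r - value % r);
}
```

With a 24-digit float and round-to-nearest, conversion_error<float>(16777217) would be -1, because 16777217 rounds down to 16777216. The sign and size of the result depend on the implementation's rounding rule.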
