Why do compilers fix the digits of a floating-point number to 6?


Question

According to The C++ Programming Language, 4th Edition, section 6.2.5:


There are three floating-point types: float (single-precision), double (double-precision), and long double (extended-precision)

Refer to: http://en.wikipedia.org/wiki/Single-precision_floating-point_format


The true significand includes 23 fraction bits to the right of the binary point and an implicit leading bit (to the left of the binary point) with value 1 unless the exponent is stored with all zeros. Thus only 23 fraction bits of the significand appear in the memory format but the total precision is 24 bits (equivalent to log10(2^24) ≈ 7.225 decimal digits).

→ So the maximum number of decimal digits of a floating-point number is 7 in the binary32 interchange format (a computer number format that occupies 4 bytes (32 bits) in computer memory).

When I test on different compilers (such as GCC and the VC compiler)
→ It always outputs 6 as the value.

Take a look into float.h of each compiler
→ I found that 6 is fixed.

Questions:

  • Do you know why there is a difference here (between the theoretical value, 7, and the actual value, 6)?
    "7" sounds more reasonable, because when I test with the code below the 7-digit value is still printed correctly, while an 8-digit value is not.
  • Why don't compilers check the interchange format to decide the number of digits a floating-point type can represent (instead of using a fixed value)?

Code:

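#include <iostream>
#include <limits>

using namespace std;

int main()
{
    // Prints 6 for an IEEE 754 binary32 float.
    cout << numeric_limits<float>::digits10 << endl;

    // A 7-digit integer; it is below 2^24, so float stores it exactly.
    float f = -9999999;

    // Allow up to 10 significant digits in the output.
    cout.precision(10);

    cout << f << endl;
}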


Recommended answer

You're not reading the documentation.


The value of std::numeric_limits<T>::digits10 is the number of base-10 digits that can be represented by the type T without change, that is, any number with this many decimal digits can be converted to a value of type T and back to decimal form, without change due to rounding or overflow. For base-radix types, it is the value of digits (digits-1 for floating-point types) multiplied by log10(radix) and rounded down.

The standard 32-bit IEEE 754 floating-point type has a 24-bit fractional part (23 bits written, one implied), which may suggest that it can represent 7-digit decimals (24 * std::log10(2) is 7.22), but relative rounding errors are non-uniform and some floating-point values with 7 decimal digits do not survive conversion to 32-bit float and back: the smallest positive example is 8.589973e9, which becomes 8.589974e9 after the roundtrip. These rounding errors cannot exceed one bit in the representation, and digits10 is calculated as (24-1)*std::log10(2), which is 6.92. Rounding down results in the value 6.






std::numeric_limits<float>::max_digits10 is 9:


The value of std::numeric_limits<T>::max_digits10 is the number of base-10 digits that are necessary to uniquely represent all distinct values of the type T, such as necessary for serialization/deserialization to text. This constant is meaningful for all floating-point types.

Unlike most mathematical operations, the conversion of a floating-point value to text and back is exact as long as at least max_digits10 were used (9 for float, 17 for double): it is guaranteed to produce the same floating-point value, even though the intermediate text representation is not exact. It may take over a hundred decimal digits to represent the precise value of a float in decimal notation.
