long double (GCC specific) and __float128


Problem Description

I'm looking for detailed information on long double and __float128 in GCC/x86 (more out of curiosity than because of an actual problem).

Few people will probably ever need these (I've just, for the first time ever, truly needed a double), but I guess it is still worthwhile (and interesting) to know what you have in your toolbox and what it's about.

In that light, please excuse my somewhat open questions:

  1. Could someone explain the implementation rationale and intended usage of these types, also in comparison to each other? For example, are they "embarrassment implementations" because the standard allows for the type, and someone might complain if they're only just the same precision as double, or are they intended as first-class types?
  2. Alternatively, does someone have a good, usable web reference to share? A Google search on "long double" site:gcc.gnu.org/onlinedocs didn't give me much that's truly useful.
  3. Assuming that the common mantra "if you believe that you need double, you probably don't understand floating point" does not apply, i.e. you really need more precision than just float, and one doesn't care whether 8 or 16 bytes of memory are burnt... is it reasonable to expect that one can as well just jump to long double or __float128 instead of double without a significant performance impact?
  4. The "extended precision" feature of Intel CPUs has historically been a source of nasty surprises when values were moved between memory and registers. If 96 bits are actually stored, the long double type should eliminate this issue. On the other hand, I understand that the long double type is mutually exclusive with -mfpmath=sse, as there is no such thing as "extended precision" in SSE. __float128, on the other hand, should work just perfectly fine with SSE math (though in the absence of quad precision instructions certainly not on a 1:1 instruction base). Am I right in these assumptions?

(3. and 4. can probably be figured out with some work spent on profiling and disassembling, but maybe someone else had the same thought previously and has already done that work.)

Background (this is the TL;DR part):
I initially stumbled over long double because I was looking up DBL_MAX in <float.h>, and incidentally LDBL_MAX is on the next line. "Oh look, GCC actually has 128 bit doubles, not that I need them, but... cool" was my first thought. Surprise, surprise: sizeof(long double) returns 12... wait, you mean 16?

The C and C++ standards unsurprisingly do not give a very concrete definition of the type. C99 (6.2.5 10) says that the numbers of double are a subset of long double whereas C++03 states (3.9.1 8) that long double has at least as much precision as double (which is the same thing, only worded differently). Basically, the standards leave everything to the implementation, in the same manner as with long, int, and short.

Wikipedia says that GCC uses "80-bit extended precision on x86 processors regardless of the physical storage used".

The GCC documentation states, all on the same page, that the size of the type is 96 bits because of the i386 ABI, but that no more than 80 bits of precision are enabled by any option (huh? what?), and also that Pentium and newer processors want them aligned as 128-bit numbers. This is the default under 64 bits and can be manually enabled under 32 bits, resulting in 32 bits of zero padding.

Time to run a test:

#include <stdio.h>
#include <cfloat>

int main()
{
#ifdef USE_FLOAT128
    typedef __float128  long_double_t;
#else
    typedef long double long_double_t;
#endif

    long_double_t ld;

    int* i = (int*) &ld;
    i[0] = i[1] = i[2] = i[3] = 0xdeadbeef;

    for (ld = 0.0000000000000001; ld < LDBL_MAX; ld *= 1.0000001)
        printf("%08x-%08x-%08x-%08x\r", i[0], i[1], i[2], i[3]);

    return 0;
}

The output, when using long double, looks somewhat like this, with the marked digits being constant, and all others eventually changing as the numbers get bigger and bigger:

5636666b-c03ef3e0-00223fd8-deadbeef
                  ^^       ^^^^^^^^

This suggests that it is not an 80-bit number. An 80-bit number has 18 hex digits. I see 22 hex digits changing, which looks much more like a 96-bit number (24 hex digits). It also isn't a 128-bit number since 0xdeadbeef isn't touched, which is consistent with sizeof returning 12.

The output for __float128 looks like it's really just a 128 bit number. All bits eventually flip.

Compiling with -m128bit-long-double does not align long double to 128 bits with a 32-bit zero padding, as indicated by the documentation. It doesn't use __int128 either, but indeed seems to align to 128 bits, padding with the value 0x7ffdd000(?!).

Further, LDBL_MAX seems to work as +inf for both long double and __float128. Adding or subtracting a number like 1.0E100 or 1.0E2000 to/from LDBL_MAX results in the same bit pattern.
Up to now, it was my belief that the foo_MAX constants were to hold the largest representable number that is not +inf (apparently that isn't the case?). I'm also not quite sure how an 80-bit number could conceivably act as +inf for a 128 bit value... maybe I'm just too tired at the end of the day and have done something wrong.

Solution

Ad 1.

Those types are designed to work with numbers with a huge dynamic range. long double is implemented in a native way in the x87 FPU. The 128-bit double, I suspect, would be implemented in software on modern x86s, as there is no hardware capable of doing those computations natively.

The funny thing is that it's quite common to do many floating point operations in a row, and the intermediate results are not actually stored in declared variables but rather kept in FPU registers, taking advantage of the full precision. That's why the comparison:

double x = sin(0); if (x == sin(0)) printf("Equal!");

is not safe and cannot be guaranteed to work (without additional switches).

Ad 3.

There's an impact on speed depending on what precision you use. You can change the precision of the FPU by using:

void
set_fpu (unsigned int mode)
{
  /* Load the x87 FPU control word; bits 8-9 select the precision
     (e.g. 0x027F for 53-bit double, 0x037F for 64-bit extended). */
  asm ("fldcw %0" : : "m" (*&mode));
}

It will be faster for shorter variables and slower for longer ones. 128-bit doubles will probably be done in software, so they will be much slower.

It's not only about RAM being wasted, it's about the cache being wasted. Going from a 64-bit double to an 80-bit double will waste from 33% (with 32 bits of padding) to almost 50% (with 64 bits of padding) of the memory (including cache).

Ad 4.

On the other hand, I understand that the long double type is mutually exclusive with -mfpmath=sse, as there is no such thing as "extended precision" in SSE. __float128, on the other hand, should work just perfectly fine with SSE math (though in absence of quad precision instructions certainly not on a 1:1 instruction base). Am I right under these assumptions?

The FPU and SSE units are totally separate. You can write code that uses the FPU and SSE at the same time. The question is: what will the compiler generate if you constrain it to use only SSE? Will it try to use the FPU anyway? I've been doing some programming with SSE, and GCC will only generate scalar (SISD) code on its own; you have to help it to use the SIMD versions. __float128 will probably work on every machine, even an 8-bit AVR uC. It's just fiddling with bits, after all.

80 bits in hex representation is actually 20 hex digits. Maybe the bits which are not used are left over from some old operation? On my machine, I compiled your code and only 20 hex digits change in long double mode: 66b4e0d2-ec09c1d5-00007ffe-deadbeef

The 128-bit version has all the bits changing. Looking at the objdump output, it looks as if it is using software emulation; there are almost no FPU instructions.

Further, LDBL_MAX, seems to work as +inf for both long double and __float128. Adding or subtracting a number like 1.0E100 or 1.0E2000 to/from LDBL_MAX results in the same bit pattern. Up to now, it was my belief that the foo_MAX constants were to hold the largest representable number that is not +inf (apparently that isn't the case?).

This seems to be strange...

I'm also not quite sure how an 80-bit number could conceivably act as +inf for a 128-bit value... maybe I'm just too tired at the end of the day and have done something wrong.

It's probably being extended. The bit pattern which is recognized as +inf in the 80-bit format is translated to +inf in the 128-bit float too.
