long double (GCC specific) and __float128

Published: 2021/12/18 23:20:43. Tags: gcc, long-double

Question

I'm looking for detailed information on long double and __float128 in GCC/x86 (more out of curiosity than because of an actual problem).

Few people will probably ever need these (I've just, for the first time ever, truly needed a double), but I guess it is still worthwhile (and interesting) to know what you have in your toolbox and what it's about.

In that light, please excuse my somewhat open questions:

1. Could someone explain the implementation rationale and intended usage of these types, also in comparison with each other? For example, are they "embarrassment implementations" because the standard allows for the type, and someone might complain if they're only just the same precision as double, or are they intended as first-class types?
2. Alternatively, does someone have a good, usable web reference to share? A Google search on "long double" site:gcc.gnu.org/onlinedocs didn't give me much that's truly useful.
3. Assuming that the common mantra "if you believe that you need double, you probably don't understand floating point" does not apply, i.e. you really need more precision than just float, and one doesn't care whether 8 or 16 bytes of memory are burnt... is it reasonable to expect that one can as well just jump to long double or __float128 instead of double without a significant performance impact?
4. The "extended precision" feature of Intel CPUs has historically been a source of nasty surprises when values were moved between memory and registers. If 96 bits are actually stored, the long double type should eliminate this issue. On the other hand, I understand that the long double type is mutually exclusive with -mfpmath=sse, as there is no such thing as "extended precision" in SSE. __float128, on the other hand, should work just perfectly fine with SSE math (though in the absence of quad precision instructions certainly not on a 1:1 instruction basis). Am I right in these assumptions?

(3. and 4. can probably be figured out with some work spent on profiling and disassembling, but maybe someone else had the same thought previously and has already done that work.)

Background (this is the TL;DR part):
I initially stumbled over long double because I was looking up DBL_MAX in <float.h>, and incidentally LDBL_MAX is on the next line. "Oh look, GCC actually has 128 bit doubles, not that I need them, but... cool" was my first thought. Surprise, surprise: sizeof(long double) returns 12... wait, you mean 16?

The C and C++ standards unsurprisingly do not give a very concrete definition of the type. C99 (6.2.5 10) says that the numbers of double are a subset of long double whereas C++03 states (3.9.1 8) that long double has at least as much precision as double (which is the same thing, only worded differently). Basically, the standards leave everything to the implementation, in the same manner as with long, int, and short.

Wikipedia says that GCC uses "80-bit extended precision on x86 processors regardless of the physical storage used".

The GCC documentation states, all on the same page, that the size of the type is 96 bits because of the i386 ABI, but that no more than 80 bits of precision are enabled by any option (huh? what?), and that Pentium and newer processors want them aligned as 128-bit numbers. This is the default under 64 bits and can be manually enabled under 32 bits, resulting in 32 bits of zero padding.

Time to run a test:

#include <stdio.h>
#include <cfloat>

int main()
{
#ifdef  USE_FLOAT128
    typedef __float128  long_double_t;
#else
    typedef long double long_double_t;
#endif

    long_double_t ld;

    // Pre-fill four 32-bit words with a marker so untouched bytes stay visible.
    int* i = (int*) &ld;
    i[0] = i[1] = i[2] = i[3] = 0xdeadbeef;

    // Sweep upwards and dump the raw words of every value.
    for(ld = 0.0000000000000001; ld < LDBL_MAX; ld *= 1.0000001)
        printf("%08x-%08x-%08x-%08x\n", i[0], i[1], i[2], i[3]);

    return 0;
}

The output, when using long double, looks somewhat like this, with the marked digits being constant, and all others eventually changing as the numbers get bigger and bigger:

5636666b-c03ef3e0-00223fd8-deadbeef
                  ^^       ^^^^^^^^

This suggests that it is not an 80 bit number. An 80-bit number has 18 hex digits. I see 22 hex digits changing, which looks much more like a 96 bits number (24 hex digits). It also isn't a 128 bit number since 0xdeadbeef isn't touched, which is consistent with sizeof returning 12.

The output for __float128 looks like it's really just a 128 bit number. All bits eventually flip.

Compiling with -m128bit-long-double does not align long double to 128 bits with a 32-bit zero padding, as indicated by the documentation. It doesn't use __float128 either, but indeed seems to align to 128 bits, padding with the value 0x7ffdd000(?!).

Further, LDBL_MAX seems to work as +inf for both long double and __float128. Adding or subtracting a number like 1.0E100 or 1.0E2000 to/from LDBL_MAX results in the same bit pattern.
Up to now, it was my belief that the foo_MAX constants were to hold the largest representable number that is not +inf (apparently that isn't the case?). I'm also not quite sure how an 80-bit number could conceivably act as +inf for a 128-bit value... maybe I'm just too tired at the end of the day and have done something wrong.

Recommended answer

Ad 1.

Those types are designed to work with numbers with a huge dynamic range. long double is implemented natively in the x87 FPU. I suspect the 128-bit type would be implemented in software on modern x86s, as there is no hardware to do the computations.

The funny thing is that it's quite common to do many floating point operations in a row, with the intermediate results not actually stored in declared variables but kept in FPU registers at full precision. That's why a comparison like:

double x = sin(0); if (x == sin(0)) printf("Equal!");

is not safe and cannot be guaranteed to work (without additional switches).

Ad 3.

There's an impact on speed depending on which precision you use. You can change the precision used by the FPU with:

void 
set_fpu (unsigned int mode)
{
  asm ("fldcw %0" : : "m" (*&mode));
}

It will be faster for shorter variables and slower for longer ones. 128-bit doubles will probably be done in software, so they will be much slower.

It's not only about wasted RAM, it's also about wasted cache. Going from a 64-bit double to an 80-bit long double wastes from 33% (32-bit targets, where the value is stored in 12 bytes) to almost 50% (64-bit targets, where it is stored in 16 bytes) of the memory, including cache.

Ad 4.

On the other hand, I understand that the long double type is mutually exclusive with -mfpmath=sse, as there is no such thing as "extended precision" in SSE. __float128, on the other hand, should work just perfectly fine with SSE math (though in absence of quad precision instructions certainly not on a 1:1 instruction base). Am I right under these assumptions?

The FPU and SSE units are totally separate. You can write code using the FPU at the same time as SSE. The question is what the compiler will generate if you constrain it to use only SSE. Will it try to use the FPU anyway? I've been doing some programming with SSE, and GCC will generate only scalar (SISD) instructions on its own. You have to help it to use the SIMD versions. __float128 will probably work on every machine, even the 8-bit AVR uC. It's just fiddling with bits after all.

An 80-bit value in hex representation is actually 20 hex digits. Maybe the bits which are not used are left over from some old operation? On my machine, I compiled your code and only 20 hex digits change in long double mode: 66b4e0d2-ec09c1d5-00007ffe-deadbeef

The 128-bit version has all the bits changing. Looking at the objdump it looks as if it was using software emulation, there are almost no FPU instructions.

Further, LDBL_MAX seems to work as +inf for both long double and __float128. Adding or subtracting a number like 1.0E100 or 1.0E2000 to/from LDBL_MAX results in the same bit pattern. Up to now, it was my belief that the foo_MAX constants were to hold the largest representable number that is not +inf (apparently that isn't the case?).

This seems to be strange...

I'm also not quite sure how an 80-bit number could conceivably act as +inf for a 128-bit value... maybe I'm just too tired at the end of the day and have done something wrong.

It's probably being extended. The pattern which is recognized to be +inf in 80-bit is translated to +inf in 128-bit float too.
