Why is double preferred over float?


Question

In most of the code I see around, double is favoured over float, even when high precision is not needed.

Since there are performance penalties when using double types (CPU/GPU/memory/bus/cache/...), what is the reason for this overuse of double?

Example: in computational fluid dynamics, all the software I have worked with uses doubles. In this case high precision is useless (because of the errors due to the approximations in the mathematical model), and there is a huge amount of data to be moved around, which could be cut in half by using floats.

The fact that today's computers are powerful is meaningless, because they are used to solve more and more complex problems.

Answer

In my opinion the answers so far don't really get the right point across, so here's my crack at it.

The short answer is C++ developers use doubles over floats:

  • To avoid premature optimization when they don't understand the performance trade-offs well ("they have higher precision, why not?" is the thought process)
  • Habit
  • Culture
  • To match library function signatures
  • To match simple-to-write floating point literals (you can write 0.0 instead of 0.0f)

It's true that a double may be as fast as a float for a single computation, because most FPUs use an internal representation that is wider than either the 32-bit float or the 64-bit double.

However, that's only a small piece of the picture. Nowadays, optimizing individual operations doesn't mean anything if you're bottlenecked on cache/memory bandwidth.

Here is why some developers seeking to optimize their code should look into using 32-bit floats over 64-bit doubles:

  • They fit in half the memory. Which is like having all your caches be twice as large. (big win!!!)
  • If you really care about performance you'll use SSE instructions. SSE has separate instructions for 32-bit and 64-bit floating point values. The 32-bit versions can fit 4 values in a 128-bit register operand, but the 64-bit versions can only fit 2 values. In this scenario you can likely double your FLOPS by using floats over doubles, because each instruction operates on twice as much data.

In general, there is a real lack of knowledge about how floating point numbers really work among the majority of developers I've encountered. So I'm not really surprised that most developers blindly use double.
