Golang floating point precision float32 vs float64

Problem description

I wrote a program to demonstrate floating point error in Go:

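package main

import "fmt"

func main() {
    a := float64(0.2)
    a += 0.1
    a -= 0.3 // leaves a tiny nonzero rounding residue in float64
    var i int
    // Double a until it reaches 1.0, counting the iterations.
    for i = 0; a < 1.0; i++ {
        a += a
    }
    fmt.Printf("After %d iterations, a = %e\n", i, a)
}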

It prints:

After 54 iterations, a = 1.000000e+00

This matches the behaviour of the same program written in C (using the double type).

However, if float32 is used instead, the program gets stuck in an infinite loop! If you modify the C program to use a float instead of a double, it prints:

After 27 iterations, a = 1.600000e+00

Why doesn't the Go program have the same output as the C program when using float32?

Solution

I agree with ANisus: Go is doing the right thing. As for C, I'm not convinced by his guess.
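As a quick check of the Go side, here is a minimal sketch in which every step is rounded to float32. The helper name residual32 and the cap of 100 iterations are my own additions for illustration; the cap only keeps the demonstration from hanging.

```go
package main

import "fmt"

// residual32 evaluates 0.2 + 0.1 - 0.3 with every intermediate
// result rounded to float32 (hypothetical helper, not from the question).
func residual32() float32 {
	a := float32(0.2)
	a += 0.1 // the untyped constant 0.1 is converted to float32
	a -= 0.3 // likewise for 0.3
	return a
}

func main() {
	a := residual32()
	fmt.Println("residual is exactly zero:", a == 0)

	// Doubling zero never reaches 1.0, so the original loop cannot end.
	// The cap below is an assumption added to keep this sketch finite.
	i := 0
	for ; a < 1.0 && i < 100; i++ {
		a += a
	}
	fmt.Printf("Stopped after %d iterations, a = %e\n", i, a)
}
```

In pure float32 arithmetic, 0.2 + 0.1 rounds to the same value as the float32 nearest to 0.3, so the subtraction leaves exactly 0 and doubling it makes no progress.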

The C standard does not require it, but most libc implementations convert a decimal representation to the nearest float (at least to comply with IEEE-754 2008 or ISO 10967), so I don't think this is the most probable explanation.

There are several reasons why the C program's behavior might differ. In particular, some intermediate computations might be performed with excess precision (double or long double).

The most probable cause I can think of is that you wrote 0.1 instead of 0.1f in C.
In that case, you may have introduced excess precision in the initialization
(you sum float a + double 0.1, so the float is converted to double, and the result is then converted back to float).

If I emulate these operations

float32(float32(float32(0.2) + float64(0.1)) - float64(0.3))

Then I find something near 1.1920929e-8f

After 27 iterations, this sums to 1.6f
