Golang floating point precision float32 vs float64
Problem description
I wrote a program to demonstrate floating point error in Go:
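package main

import "fmt"

func main() {
	a := float64(0.2)
	a += 0.1
	a -= 0.3 // mathematically zero, but a small rounding error remains
	var i int
	// Double a until it reaches 1.0, counting the doublings.
	for i = 0; a < 1.0; i++ {
		a += a
	}
	fmt.Printf("After %d iterations, a = %e\n", i, a)
}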
It prints:
After 54 iterations, a = 1.000000e+00
This matches the behaviour of the same program written in C (using the double type).
However, if float32 is used instead, the program gets stuck in an infinite loop! If you modify the C program to use a float instead of a double, it prints:
After 27 iterations, a = 1.600000e+00
Why doesn't the Go program have the same output as the C program when using float32?
I agree with ANisus: Go is doing the right thing. Concerning C, I'm not convinced by his guess.
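A quick way to check the Go side is to print the value of a just before the doubling loop. The sketch below does this in both precisions; the helper name seeds is hypothetical and not part of the original program. Assuming IEEE-754 round-to-nearest arithmetic, the float32 value should come out as exactly 0, which no amount of doubling can push past 1.0 (hence the infinite loop), while the float64 value should be 2^-54, which needs 54 doublings to reach 1.0.

```go
package main

import "fmt"

// seeds returns the value of a right before the doubling loop,
// computed once in float32 and once in float64.
// (Hypothetical helper, written only for this illustration.)
func seeds() (float32, float64) {
	a32 := float32(0.2)
	a32 += 0.1 // the constant is converted to float32 before the addition
	a32 -= 0.3

	a64 := float64(0.2)
	a64 += 0.1
	a64 -= 0.3

	return a32, a64
}

func main() {
	a32, a64 := seeds()
	fmt.Printf("float32 seed: %e\n", a32)
	fmt.Printf("float64 seed: %e\n", a64)
}
```

The loop itself is deliberately left out, since with a zero starting value it would never terminate.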
The C standard does not mandate it, but most implementations of libc will convert the decimal representation to the nearest float (at least to comply with IEEE-754 2008 or ISO 10967), so I don't think this is the most probable explanation.
There are several reasons why the C program's behavior might differ. In particular, some intermediate computations might be performed with excess precision (double or long double).
The most probable cause I can think of is that you wrote 0.1 instead of 0.1f in C.
In that case, you might have caused excess precision in the initialization
(you sum float a + double 0.1, so the float is converted to double, then the result is converted back to float).
If I emulate these operations (written schematically):
float32(float32(float32(0.2) + float64(0.1)) - float64(0.3))
then I find something near 1.1920929e-8f.
After 27 iterations (each one doubling the value), this sums to 1.6f.
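The emulation above can be turned into a small runnable Go program. This is only a sketch under the assumption made in this answer, namely that the C source declares float a but writes 0.2, 0.1 and 0.3 as double literals; the function name emulateC and the variable names are hypothetical. The values are held in float64 variables and every conversion is written out explicitly, because Go does not allow float32 and float64 operands to be mixed in one expression.

```go
package main

import "fmt"

// emulateC mimics a C program that uses `float a` together with
// double literals (no f suffix). This is an assumption about the
// C source, not something confirmed by the question.
func emulateC() (seed float32, iterations int, final float32) {
	// Variables instead of constants, so rounding happens at run time.
	var c02, c01, c03 float64 = 0.2, 0.1, 0.3

	a := float32(c02)             // float a = 0.2;
	a = float32(float64(a) + c01) // a += 0.1;  computed in double, stored as float
	a = float32(float64(a) - c03) // a -= 0.3;  computed in double, stored as float
	seed = a

	// Same doubling loop as in the question.
	for a < 1.0 {
		a += a
		iterations++
	}
	return seed, iterations, a
}

func main() {
	seed, n, final := emulateC()
	fmt.Printf("seed = %e\n", seed)
	fmt.Printf("After %d iterations, a = %e\n", n, final)
}
```

If the assumption holds, this should reproduce the seed near 1.1920929e-8 and the 27 iterations reported for the C program, whereas a version that stays in float32 throughout starts from zero.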