Double in place of Float and Float rounding

Question

Edit:

This question covers two topics:


  • Efficiency of using double instead of float

  • Float precision after rounding

Is there any reason why I should not always use Java double instead of float?

I ask because the test code below fails when it uses floats, and it is not clear why, since the only difference between the two tests is the use of float instead of double.

import java.math.BigDecimal;
import java.math.RoundingMode;

import org.junit.Assert;
import org.junit.Test;

public class BigDecimalTest {

    @Test
    public void testDeltaUsingDouble() { // test passes
        // Both values are truncated to two decimal places: 0.99 and 0.97.
        BigDecimal left = new BigDecimal("0.99").setScale(2, RoundingMode.DOWN);
        BigDecimal right = new BigDecimal("0.979").setScale(2, RoundingMode.DOWN);

        Assert.assertEquals(left.doubleValue(), right.doubleValue(), 0.09);
        Assert.assertEquals(left.doubleValue(), right.doubleValue(), 0.03);

        Assert.assertNotEquals(left.doubleValue(), right.doubleValue(), 0.02);
        Assert.assertNotEquals(left.doubleValue(), right.doubleValue(), 0.01);
        Assert.assertNotEquals(left.doubleValue(), right.doubleValue(), 0.0);
    }

    @Test
    public void testDeltaUsingFloat() { // test fails on 'failing assert'
        BigDecimal left = new BigDecimal("0.99").setScale(2, RoundingMode.DOWN);
        BigDecimal right = new BigDecimal("0.979").setScale(2, RoundingMode.DOWN);

        Assert.assertEquals(left.floatValue(), right.floatValue(), 0.09);
        Assert.assertEquals(left.floatValue(), right.floatValue(), 0.03);

        /* failing assert */
        Assert.assertNotEquals(
                left.floatValue() + " - " + right.floatValue() + " = " + (left.floatValue() - right.floatValue()),
                left.floatValue(), right.floatValue(), 0.02);
        Assert.assertNotEquals(left.floatValue(), right.floatValue(), 0.01);
        Assert.assertNotEquals(left.floatValue(), right.floatValue(), 0.0);
    }
}

Failure message:

java.lang.AssertionError: 0.99 - 0.97 = 0.01999998. Actual: 0.9900000095367432
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failEquals(Assert.java:185)
at org.junit.Assert.assertNotEquals(Assert.java:230)
at com.icode.common.BigDecimalTest.testDeltaUsingFloat(BigDecimalTest.java:34)

Any idea why this test fails, and why I shouldn't just always use double instead of float? I mean a reason other than a double being wider than a float, of course.

Edit:
The funny thing is that Assert.assertNotEquals(double, double, delta) takes doubles in both cases, so the floats returned in the failing test are widened to double anyway. Why does the test fail, then?

Edit:
Maybe this other question is related, but I am not sure:
hex not the same

Edit:
From the answer to the question "hex not the same", it can be concluded that the IEEE 754 representation of .99 as a float differs from its representation as a double for the same decimal value. This is due to rounding.

Hence we get this:


  • 0.99 - 0.97 = 0.01999998 // in float case

  • 0.99 - 0.97 = 0.020000000000000018 // in double case

The delta in the failing assertion is 0.02, and the float difference 0.01999998 is below it. The two numbers are therefore treated as equal, while the test asserts that they are not, hence the failure.

Do you agree with all of this?

Recommended answer

The documentation for BigDecimal is silent about how floatValue() rounds. I presume it uses round-to-nearest, ties-to-even.

left and right are set to .99 and .97, respectively. When these are converted to double in round-to-nearest mode, the results are 0.9899999999999999911182158029987476766109466552734375 (in hexadecimal floating-point, 0x1.fae147ae147aep-1) and 0.9699999999999999733546474089962430298328399658203125 (0x1.f0a3d70a3d70ap-1). When those are subtracted, the result is 0.020000000000000017763568394002504646778106689453125, which clearly exceeds .02.

When .99 and .97 are converted to float, the results are 0.9900000095367431640625 (0x1.fae148p-1) and 0.9700000286102294921875 (0x1.f0a3d8p-1). When those are subtracted, the result is 0.019999980926513671875, which is clearly less than .02.
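The conversions described above can be inspected directly. The following is a minimal sketch, not part of the original answer; the class name ExactValueDemo and its helper exact are made up for illustration. It relies on new BigDecimal(double), which preserves the exact binary value, and on the fact that widening a float to double does not change its value. That is also why passing floats to assertNotEquals(double, double, delta) does not undo the earlier rounding to float.

```java
import java.math.BigDecimal;

// Hypothetical demo class; the names here are illustrative only.
class ExactValueDemo {
    // new BigDecimal(double) keeps the exact binary value, with no decimal rounding.
    static BigDecimal exact(double d) {
        return new BigDecimal(d);
    }

    // Widening a float to double is exact, so this shows the float's true value.
    static BigDecimal exact(float f) {
        return new BigDecimal((double) f);
    }

    public static void main(String[] args) {
        System.out.println("double 0.99 = " + exact(0.99));
        System.out.println("double 0.97 = " + exact(0.97));
        System.out.println("float  0.99 = " + exact(0.99f));
        System.out.println("float  0.97 = " + exact(0.97f));

        double dDiff = 0.99 - 0.97;    // computed in double
        float fDiff = 0.99f - 0.97f;   // computed in float
        System.out.println("double diff = " + exact(dDiff) + ", above 0.02: " + (dDiff > 0.02));
        System.out.println("float  diff = " + exact(fDiff) + ", below 0.02: " + (fDiff < 0.02));
    }
}
```

Running main prints the exact decimal expansion of each stored value, which can be compared against the figures quoted in the answer.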

Simply put, when a decimal numeral is converted to floating-point, the rounding may be up or down. It depends on where the number happens to lie relative to the nearest representable floating-point values. If it is not controlled or analyzed, it is practically random. Thus, sometimes you end up with a greater value than you might have expected, and sometimes you end up with a lesser value.
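To see in which direction a given decimal numeral gets rounded, one can compare the stored binary value with the original decimal. This is a small sketch under the assumption that Float.parseFloat and Double.parseDouble round to nearest; the class RoundingDirectionDemo and its methods are hypothetical names introduced only for this illustration.

```java
import java.math.BigDecimal;

// Hypothetical demo class; the names here are illustrative only.
class RoundingDirectionDemo {
    // Compares the exact stored value with the decimal numeral it came from.
    static String direction(BigDecimal stored, String decimal) {
        int cmp = stored.compareTo(new BigDecimal(decimal));
        return cmp > 0 ? "up" : cmp < 0 ? "down" : "exact";
    }

    static String floatDirection(String decimal) {
        // Widening float to double is exact, so BigDecimal sees the float's true value.
        return direction(new BigDecimal((double) Float.parseFloat(decimal)), decimal);
    }

    static String doubleDirection(String decimal) {
        return direction(new BigDecimal(Double.parseDouble(decimal)), decimal);
    }

    public static void main(String[] args) {
        String[] samples = {"0.99", "0.97", "0.09", "0.07", "0.5"};
        for (String s : samples) {
            System.out.println(s + ": float rounds " + floatDirection(s)
                    + ", double rounds " + doubleDirection(s));
        }
    }
}
```

For 0.99 and 0.97 the float conversion lands above the decimal value and the double conversion lands below it, matching the figures in the answer; 0.5 is exactly representable in both types.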

Using double instead of float would not guarantee that results similar to the above do not occur. It is merely happenstance that the double value in this case exceeded the exact mathematical value and the float value did not. With other numbers, it could be the other way around. For example, with double, .09 - .07 is less than .02, but, with float, .09f - .07f is greater than .02.
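The .09 and .07 example can be checked with a few lines of Java. This is only an illustrative sketch; the class name OppositeCaseDemo and its methods are invented for this purpose. Each difference is compared against the double literal 0.02, which is how a double delta would be applied.

```java
// Hypothetical demo class; the names here are illustrative only.
class OppositeCaseDemo {
    // Difference computed in double arithmetic.
    static double doubleDiff() {
        return 0.09 - 0.07;
    }

    // Difference computed in float arithmetic.
    static float floatDiff() {
        return 0.09f - 0.07f;
    }

    public static void main(String[] args) {
        // The float is widened to double before the comparison; widening is exact.
        System.out.println("double: " + doubleDiff() + ", below 0.02: " + (doubleDiff() < 0.02));
        System.out.println("float:  " + floatDiff() + ", above 0.02: " + (floatDiff() > 0.02));
    }
}
```

Here the situation from the question is reversed: the double difference falls below 0.02 and the float difference falls above it.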

There is a lot of information about how to deal with floating-point arithmetic, such as Handbook of Floating-Point Arithmetic. It is too large a subject to cover in Stack Overflow questions. There are university courses on it.
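As one possible way to sidestep binary rounding in a case like the one in the question, the comparison can stay in BigDecimal. This is a sketch and not something the answer prescribes; the class DecimalDeltaDemo and the helper withinDelta are hypothetical, and the helper assumes the convention that two values count as equal when |a - b| <= delta.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical demo class; the names here are illustrative only.
class DecimalDeltaDemo {
    // True when |a - b| <= delta, computed exactly in decimal arithmetic.
    static boolean withinDelta(BigDecimal a, BigDecimal b, BigDecimal delta) {
        return a.subtract(b).abs().compareTo(delta) <= 0;
    }

    public static void main(String[] args) {
        BigDecimal left = new BigDecimal("0.99").setScale(2, RoundingMode.DOWN);
        BigDecimal right = new BigDecimal("0.979").setScale(2, RoundingMode.DOWN);

        // The exact decimal difference is 0.02.
        System.out.println("difference = " + left.subtract(right));
        System.out.println("within 0.03: " + withinDelta(left, right, new BigDecimal("0.03")));
        System.out.println("within 0.02: " + withinDelta(left, right, new BigDecimal("0.02")));
        System.out.println("within 0.01: " + withinDelta(left, right, new BigDecimal("0.01")));
    }
}
```

In exact decimal arithmetic the difference is exactly 0.02, so under the assumed convention the values count as equal at a delta of 0.02. If JUnit applies the same convention, the double test in the question passes only because binary rounding pushed the difference slightly above 0.02.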

Often on today’s typical processors, there is little extra expense for using double rather than float; simple scalar floating-point operations are performed at nearly the same speeds for double and float. Performance differences arise when you have so much data that the time to transfer them (from disk to memory or memory to processor) becomes important, or the space they occupy on disk becomes large, or your software uses SIMD features of processors. (SIMD allows processors to perform the same operation on multiple pieces of data, in parallel. Current processors typically provide about twice the bandwidth for float SIMD operations as for double SIMD operations or do not provide double SIMD operations at all.)
