Appropriate scale for converting via BigDecimal to floating point

Question

I've written an arbitrary-precision rational number class that needs to provide a way to convert to floating point. This can be done straightforwardly via BigDecimal:

return new BigDecimal(num).divide(new BigDecimal(den), 17, RoundingMode.HALF_EVEN).doubleValue();

but this requires a value for the scale parameter when dividing the decimal numbers. I picked 17 as an initial guess because that is approximately the precision of a double-precision floating-point number, but I don't know whether that is actually correct.
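For experimenting with different scales, the one-line conversion can be wrapped in a small self-contained method. This is only a sketch: the class name `RationalToDouble`, the method `toDouble`, and the assumption that the numerator and denominator are held as `BigInteger` values are illustrative and not taken from the asker's class.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.RoundingMode;

// Hypothetical stand-in for the asker's rational class; only the conversion is sketched.
class RationalToDouble {
    /** Divides num by den to `scale` decimal places, then converts that decimal to double. */
    static double toDouble(BigInteger num, BigInteger den, int scale) {
        return new BigDecimal(num)
                .divide(new BigDecimal(den), scale, RoundingMode.HALF_EVEN)
                .doubleValue();
    }

    public static void main(String[] args) {
        BigInteger one = BigInteger.ONE;
        BigInteger three = BigInteger.valueOf(3);
        // Print the double obtained for 1/3 at a few candidate scales.
        for (int scale : new int[] {5, 17, 30}) {
            System.out.println(scale + " -> " + toDouble(one, three, scale));
        }
    }
}
```

Making the scale a parameter keeps the question's expression unchanged while allowing the effect of the scale to be observed directly.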

What is the correct number to use, defined as the smallest number such that making it any larger would not make the answer any more accurate?

Recommended answer

Introduction

No finite precision suffices.

The problem posed in the question is equivalent to:

  • What precision p guarantees that converting any rational number x to p decimal digits and then to floating point yields the floating-point number nearest x (or, in case of a tie, either of the two nearest x)?

To see that this is equivalent, observe that the BigDecimal divide shown in the question returns num/den to a selected number of decimal places. The question then asks whether increasing that number of decimal places could increase the accuracy of the result. Clearly, if there is a floating-point number nearer x than the result, then the accuracy could be improved. Thus, we are asking how many decimal places are needed to guarantee that the closest floating-point number (or one of the two tied candidates) is obtained.

Since BigDecimal offers a choice of rounding methods, I will consider whether any of them suffices. For the conversion to floating point, I presume round-to-nearest-ties-to-even is used (which BigDecimal appears to use when converting to Double or Float). I give a proof using the IEEE-754 binary64 format, which Java uses for Double, but the proof applies to any binary floating-point format by changing the $2^{52}$ used below to $2^{w-1}$, where w is the number of bits in the significand.
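The choice of $2^{52}$ (and $2^{w-1}$ in general) can be checked numerically: it is the point at which adjacent floating-point values are exactly 1 apart, so integers are representable but half-integers are not. The following is a small sketch using `Math.ulp` and `Math.nextUp`; the class and method names are illustrative.

```java
// Sketch: inspect the floating-point grid around 2^(w-1) (names are illustrative).
class Binary64Spacing {
    /** Gap between d and the next double of larger magnitude. */
    static double spacingAt(double d) {
        return Math.ulp(d);
    }

    /** Same idea for the binary32 format (w = 24 significand bits). */
    static float spacingAt(float f) {
        return Math.ulp(f);
    }

    public static void main(String[] args) {
        double base = 0x1p52;   // hexadecimal floating-point literal for 2^52
        float base32 = 0x1p23f; // 2^23, the binary32 analogue

        System.out.println(spacingAt(base));   // spacing of doubles at 2^52
        System.out.println(spacingAt(base32)); // spacing of floats at 2^23

        // The double immediately above 2^52 + 1, printed as an integer.
        System.out.println((long) Math.nextUp(base + 1.0));
    }
}
```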

One of the parameters to a BigDecimal division is the rounding method. Java's BigDecimal has several rounding methods. We only need to consider three: ROUND_UP, ROUND_HALF_UP, and ROUND_HALF_EVEN. Arguments for the others are analogous to those below, by using various symmetries.

In the following, suppose we convert to decimal using any large precision p. That is, p is the number of decimal digits in the result of the conversion.

Let m be the rational number $2^{52} + 1 + \tfrac{1}{2} - 10^{-p}$. The two binary64 numbers neighboring m are $2^{52}+1$ and $2^{52}+2$. m is closer to the first one, so that is the result we require from converting m first to decimal and then to floating point.

In decimal, m is 4503599627370497.4999…, where there are p−1 trailing 9s. When rounded to p significant digits with ROUND_UP, ROUND_HALF_UP, or ROUND_HALF_EVEN, the result is $4503599627370497.5 = 2^{52} + 1 + \tfrac{1}{2}$. (Recognize that, at the position where rounding occurs, there are 16 trailing 9s being discarded, effectively a fraction of .9999999999999999 relative to the rounding position. In ROUND_UP, any non-zero discarded amount causes rounding up. In ROUND_HALF_UP and ROUND_HALF_EVEN, a discarded amount greater than ½ at that position causes rounding up.)
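The decimal rounding step can be reproduced with exact BigDecimal arithmetic. The sketch below builds m for a chosen p and rounds it to p significant digits using `RoundingMode.UP`, `HALF_UP`, and `HALF_EVEN`, the enum counterparts of the constants named above. The class name, the helper names, and the choice p = 20 are illustrative; any p of at least 17 behaves the same way.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.MathContext;
import java.math.RoundingMode;

// Sketch of the decimal rounding step only (names are illustrative).
class DecimalRoundingOfM {
    /** m = 2^52 + 1 + 1/2 - 10^(-p), built exactly. */
    static BigDecimal m(int p) {
        BigDecimal twoTo52 = new BigDecimal(BigInteger.ONE.shiftLeft(52));
        return twoTo52.add(new BigDecimal("1.5"))
                      .subtract(BigDecimal.ONE.scaleByPowerOfTen(-p));
    }

    /** m rounded to p significant decimal digits with the given rounding mode. */
    static BigDecimal rounded(int p, RoundingMode mode) {
        return m(p).round(new MathContext(p, mode));
    }

    public static void main(String[] args) {
        int p = 20; // assumed example precision
        System.out.println("m = " + m(p).toPlainString());
        RoundingMode[] modes = {RoundingMode.UP, RoundingMode.HALF_UP, RoundingMode.HALF_EVEN};
        for (RoundingMode mode : modes) {
            // Each mode discards sixteen 9s and therefore rounds up to ...497.5 (plus trailing zeros).
            System.out.println(mode + " -> " + rounded(p, mode).toPlainString());
        }
    }
}
```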

$2^{52} + 1 + \tfrac{1}{2}$ is equally close to the neighboring binary64 numbers $2^{52}+1$ and $2^{52}+2$, so the round-to-nearest-ties-to-even method produces $2^{52}+2$.

Thus, the result is $2^{52}+2$, which is not the binary64 value closest to m.
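The whole two-step conversion can be checked end to end. The following sketch rounds m to p significant digits, calls `doubleValue()`, and then uses exact BigDecimal subtraction to compare the distance from m to the returned double with the distance to its lower neighbor. It assumes, as the answer does, that `doubleValue()` rounds to nearest with ties to even; the class and method names are illustrative.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.MathContext;
import java.math.RoundingMode;

// Sketch of the full counterexample (names are illustrative).
class DoubleRoundingCounterexample {
    /** m = 2^52 + 1 + 1/2 - 10^(-p), built exactly. */
    static BigDecimal m(int p) {
        return new BigDecimal(BigInteger.ONE.shiftLeft(52))
                .add(new BigDecimal("1.5"))
                .subtract(BigDecimal.ONE.scaleByPowerOfTen(-p));
    }

    /** Two-step conversion: p significant decimal digits first, then binary64. */
    static double viaDecimal(int p) {
        return m(p).round(new MathContext(p, RoundingMode.HALF_EVEN)).doubleValue();
    }

    /** True if `candidate` is strictly farther from x than `other`, using exact arithmetic. */
    static boolean fartherThan(BigDecimal x, double candidate, double other) {
        BigDecimal dCandidate = x.subtract(new BigDecimal(candidate)).abs();
        BigDecimal dOther = x.subtract(new BigDecimal(other)).abs();
        return dCandidate.compareTo(dOther) > 0;
    }

    public static void main(String[] args) {
        int p = 20; // assumed example precision
        double got = viaDecimal(p);
        double below = Math.nextDown(got);
        System.out.println("two-step result:  " + new BigDecimal(got).toPlainString());
        System.out.println("lower neighbor:   " + new BigDecimal(below).toPlainString());
        System.out.println("result is farther from m than its neighbor: "
                + fartherThan(m(p), got, below));
    }
}
```

`new BigDecimal(double)` converts the double exactly, so the distance comparison involves no further rounding.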

Therefore, no finite precision p suffices to round all rational numbers correctly.
