printf如何从浮点数提取数字? [英] How does printf extract digits from a floating point number?

查看:53
本文介绍了printf如何从浮点数提取数字?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

printf 之类的功能如何从浮点数中提取数字?我了解原则上可以做到这一点.给定一个数字 x ,您想要第一个 n 个数字,将 x 缩放10的幂,以便使 x pow(10,n) pow(10,n-1)之间.然后将 x 转换为整数,并获取整数.

我尝试了一下,并且奏效了.有点.我的答案与 printf 对前16个十进制数字给出的答案相同,但此后的数字往往有所不同. printf 如何做到?

解决方案

经典实现是David Gay的 dtoa .确切的细节有些奥秘(请参阅为什么"dtoa.c"包含这么多代码?),但是通常它可以通过比32位,64位甚至80位浮点数所能提供的精度更高的精度来进行基本转换.为此,它使用了所谓的"bigints"或任意精度数字,它们可以容纳尽可能多的数字,以适合您的内存大小.盖伊(Gay)的代码经过修改后已被复制到无数其他库中,包括C标准库的通用实现(因此可能为您的 printf 提供支持),Java,Python,PHP,JavaScript等.>

(作为一个补充说明...并非Gay的所有这些dtoa代码副本都保持最新,因此,因为PHP在解析2.2250738585072011e-308时使用了strem的版本挂起了)

通常,如果您以显而易见"且简单的方式进行操作(例如乘以10的幂然后转换为整数),则会损失少量精度,并且某些结果将不准确...但是也许您会得到正确的前14或15位数字.Gay的dtoa()实现声称所有数字正确无误...但是结果是,代码很难遵循.跳到底部查看strtod本身,您可以看到它以快速路径"开头,该路径仅使用普通的浮点算法,但随后它检测到该结果是否不正确,并使用一个更可靠的算法使用bigints来工作所有情况下(但速度较慢).

该实现具有以下引文,您可能会发现它很有趣:

*受到如何准确打印浮点数"的启发* Guy L. Steele,Jr.和Jon L. White [Proc.ACM SIGPLAN '90,第112-126页].

该算法通过计算产生给定二进制数的十进制数字范围来工作,并且通过使用更多数字,范围会越来越小,直到您获得准确的结果或可以正确舍入到所需的数字位数为止

尤其是从2.2版算法开始,

该算法使用精确的有理算法来执行其计算,因此不会损失准确性.为了生成数字,该算法对数字进行缩放,使其形式为0.d 1 d 2 ...,其中d 1 ,d 2 ,...是基数B的数字.通过将缩放数字乘以输出底数B并取整数部分来计算第一位数字.其余部分用于使用相同方法来计算其余数字.

然后,该算法可以继续执行,直到获得准确的结果(由于浮点数以2为底,而2是10的因数,所以这总是可能的),或者直到它具有所需的位数为止.论文继续证明了该算法的正确性.


还请注意,并非 printf 的所有实现都基于Gay的dtoa,这只是一个特别常见的实现,已被大量复制.

How do functions such as printf extract digits from a floating point number? I understand how this could be done in principle. Given a number x, of which you want the first n digits, scale x by a power of 10 so that x is between pow(10, n) and pow(10, n-1). Then convert x into an integer, and take the digits of the integer.

I tried this, and it worked. Sort of. My answer was identical to the answer given by printf for the first 16 decimal digits, but tended to differ on the ones after that. How does printf do it?

解决方案

The classic implementation is David Gay's dtoa. The exact details are somewhat arcane (see Why does "dtoa.c" contain so much code?), but in general it works by doing the base conversion using more precision beyond what you can get from a 32-bit, 64-bit, or even 80-bit floating point number. To do this, it uses so-called "bigints" or arbitrary-precision numbers, which can hold as many digits as you can fit in memory. Gay's code has been copied, with modifications, into countless other libraries including common implementations for the C standard library (so it might power your printf), Java, Python, PHP, JavaScript, etc.

(As a side note... not all of these copies of Gay's dtoa code were kept up to date, so because PHP used an old version of strtod it hung when parsing 2.2250738585072011e-308.)

In general, if you do things the "obvious" and simple way like multiplying by a power of 10 and then converting the integer, you will lose a small amount of precision and some of the results will be inaccurate... but maybe you will get the first 14 or 15 digits correct. Gay's implementation of dtoa() claims to get all the digits correct... but as a result, the code is quite difficult to follow. Skip to the bottom to see strtod itself, you can see that it starts with a "fast path" which just uses ordinary floating-point arithmetic, but then it detects if that result is incorrect and uses a more reliable algorithm using bigints which works in all cases (but is slower).

The implementation has the following citation, which you may find interesting:

 * Inspired by "How to Print Floating-Point Numbers Accurately" by
 * Guy L. Steele, Jr. and Jon L. White [Proc. ACM SIGPLAN '90, pp. 112-126].

The algorithm works by calculating a range of decimal numbers which produce the given binary number, and by using more digits, the range gets smaller and smaller until you either have an exact result or you can correctly round to the requested number of digits.

In particular, from sec 2.2 Algorithm,

The algorithm uses exact rational arithmetic to perform its computations so that there is no loss of accuracy. In order to generate digits, the algorithm scales the number so that it is of the form 0.d1d2..., where d1, d2, ..., are base-B digits. The first digit is computed by multiplying the scaled number by the output base, B, and taking the integer part. The remainder is used to compute the rest of the digits using the same approach.

The algorithm can then continue until it has the exact result (which is always possible, since floating-point numbers are base 2, and 2 is a factor of 10) or until it has as many digits as requested. The paper goes on to prove the algorithm's correctness.


Also note that not all implementations of printf are based on Gay's dtoa, this is just a particularly common implementation that's been copied a lot.

这篇关于printf如何从浮点数提取数字?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆