Underlying data structure for float in Python


Question


Got a question regarding the underlying data structure of float (and precision) in Python:

>>> b = 1.4 + 2.3
>>> b
3.6999999999999997

>>> c = 3.7
>>> c
3.7000000000000002

>>> print b, c
3.7 3.7

>>> b == c
False


It seems the values of b and c are machine dependent: they are the representable numbers closest to the target values, but not exactly those values. I was surprised that we get the 'right' numbers with print, and someone told me that this is because print 'lies' while the interactive prompt chooses to tell us the truth, i.e. to show exactly what is stored.


And my questions are:

1. How does print 'lie'? E.g. in a function that takes two values and returns whether they are the same, how could I make a best guess when the number of decimal places (the precision) is unknown, as with b and c above? Is there a well-defined algorithm to do that? I was told that every language (C/C++) has this kind of issue when floating-point calculation is involved, but how do they 'solve' it?

2. Why can't we just store the actual number instead of the closest representable number? Is it a limitation, or a trade-off for efficiency?

Thanks a lot, John

Answer


For the answer to your first question, take a look at the following (slightly condensed) code from Python's source:

#define PREC_REPR       17
#define PREC_STR        12

void PyFloat_AsString(char *buf, PyFloatObject *v) {
    format_float(buf, 100, v, PREC_STR);
}

void PyFloat_AsReprString(char *buf, PyFloatObject *v) {
    format_float(buf, 100, v, PREC_REPR);
}


So basically, repr(float) returns a string formatted with 17 digits of precision, and str(float) returns a string with 12 digits of precision. As you might have guessed, print uses str(), and entering the variable name in the interpreter uses repr(). With only 12 digits of precision it looks like you get the "correct" answer, but that is only because the value you expect and the value actually stored agree in the first 12 digits.


Here is a quick example of the difference:

>>> str(.1234567890123)
'0.123456789012'
>>> repr(.1234567890123)
'0.12345678901230001'
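The answer above does not spell out an algorithm for question 1 (comparing two floats when the precision is unknown); the usual approach is a tolerance-based comparison. A minimal sketch, using Python 3.5+'s math.isclose, which is not part of the original answer:

```python
import math

b = 1.4 + 2.3   # nearest double to the true sum, not exactly 3.7
c = 3.7         # nearest double to 3.7

# Exact comparison fails: the two doubles differ in the last bit.
print(b == c)              # False

# math.isclose compares within a relative tolerance (1e-09 by default),
# which is the usual "best guess" at equality when the precision of
# the inputs is unknown.
print(math.isclose(b, c))  # True

# The same idea by hand: |b - c| <= rel_tol * max(|b|, |c|)
rel_tol = 1e-9
print(abs(b - c) <= rel_tol * max(abs(b), abs(c)))  # True
```

C and C++ have no standard equivalent of isclose, so the hand-rolled relative-tolerance check above is the common idiom there as well.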


As for your second question, I suggest you read the following section of the Python tutorial: Floating Point Arithmetic: Issues and Limitations


It boils down to efficiency: storing base-10 decimals in base 2 takes less memory and gives faster floating-point operations than any other representation, but you do need to deal with the imprecision.
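As for "storing the actual number" (question 2), the standard library does offer exact types that make the trade-off the other way, exactness at the cost of speed and memory. A small sketch, not part of the original answer:

```python
from decimal import Decimal
from fractions import Fraction

# decimal.Decimal stores base-10 digits exactly, so decimal
# arithmetic behaves the way the question expects:
b = Decimal('1.4') + Decimal('2.3')
print(b)                    # 3.7
print(b == Decimal('3.7'))  # True

# fractions.Fraction stores an exact numerator/denominator pair:
f = Fraction(14, 10) + Fraction(23, 10)
print(f)                      # 37/10
print(f == Fraction(37, 10))  # True
```

Both are considerably slower than hardware floats, which is exactly the efficiency trade-off described above.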


As JBernardo pointed out in the comments, this behavior is different in Python 2.7 and above; the following quote from the tutorial linked above describes the difference (using 0.1 as an example):


In versions prior to Python 2.7 and Python 3.1, Python rounded this value to 17 significant digits, giving ‘0.10000000000000001’. In current versions, Python displays a value based on the shortest decimal fraction that rounds correctly back to the true binary value, resulting simply in ‘0.1’.
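Under this shortest-repr rule, the displayed string always round-trips to the stored double, even though the stored double is still not exactly 0.1. A quick illustration, assuming Python 3:

```python
from decimal import Decimal

x = 0.1
# The shortest decimal string that rounds back to the same double:
print(repr(x))              # 0.1
print(float(repr(x)) == x)  # True: the string round-trips exactly

# But the stored binary value is still only the nearest double to 0.1:
print(Decimal(x))  # 0.1000000000000000055511151231257827021181583404541015625
```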
