Floating point equality in Python and in general
Problem description
I have a piece of code that behaves differently depending on whether I go through a dictionary to get conversion factors or whether I use them directly.
The following piece of code will print 1.0 == 1.0 -> False.
But if you replace factors[units_from] with 10.0 and factors[units_to] with 1.0 / 2.54, it will print 1.0 == 1.0 -> True.
#!/usr/bin/env python
base = 'cm'
factors = {
    'cm'        : 1.0,
    'mm'        : 10.0,
    'm'         : 0.01,
    'km'        : 1.0e-5,
    'in'        : 1.0 / 2.54,
    'ft'        : 1.0 / 2.54 / 12.0,
    'yd'        : 1.0 / 2.54 / 12.0 / 3.0,
    'mile'      : 1.0 / 2.54 / 12.0 / 5280,
    'lightyear' : 1.0 / 2.54 / 12.0 / 5280 / 5.87849981e12,
}
# convert 25.4 mm to inches
val = 25.4
units_from = 'mm'
units_to = 'in'
base_value = val / factors[units_from]
ret = base_value * factors[units_to]
print ret, '==', 1.0, '->', ret == 1.0
Let me first say that I am pretty sure I know what is going on here. I have seen it before in C, just never in Python, but since Python is implemented in C we're seeing it here too.
I know that floating point numbers will change values going from a CPU register to cache and back. I know that comparing what should be two equal variables will return false if one of them was paged out while the other stayed resident in a register.
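One thing that may be worth ruling out before blaming registers: if the substitution is made textually, `base_value * 1.0 / 2.54` is parsed as `(base_value * 1.0) / 2.54`, because `*` and `/` associate left to right. That is a different sequence of roundings than multiplying by the precomputed factor. The sketch below assumes the replacement was typed without parentheses and uses Python 3 syntax (also an assumption; the original script is Python 2). It prints each variant with `repr` so that no digits are hidden.

```python
val = 25.4
factor_in = 1.0 / 2.54                     # the value stored in factors['in']

base_value = val / 10.0

via_factor = base_value * factor_in        # multiply by a precomputed quotient
textual = base_value * 1.0 / 2.54          # parsed as (base_value * 1.0) / 2.54
parenthesized = base_value * (1.0 / 2.54)  # same operations as via_factor

for name, value in [("via_factor", via_factor),
                    ("textual", textual),
                    ("parenthesized", parenthesized)]:
    # repr shows enough digits to tell neighbouring doubles apart
    print(name, repr(value), value == 1.0)
```

If `via_factor` and `textual` print different values, the dictionary is not the cause; the two expressions simply perform different arithmetic.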
Questions
- What is the best way to avoid problems like this, in Python or in general?
- Am I doing something completely wrong?
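A common way to sidestep exact equality is to compare within a tolerance. The following is a minimal sketch using `math.isclose` from the standard library (Python 3.5 and later, so this assumes Python 3 rather than the Python 2 of the original script). The trimmed `factors` table and the chosen tolerances are illustrative only.

```python
import math

factors = {'mm': 10.0, 'in': 1.0 / 2.54}

ret = 25.4 / factors['mm'] * factors['in']

# Exact comparison: depends on every rounding step along the way.
print(repr(ret), ret == 1.0)

# Relative tolerance (default rel_tol=1e-09): robust to a few ULPs of error.
print(math.isclose(ret, 1.0))

# A relative tolerance alone is useless when comparing against zero...
print(math.isclose(1e-12, 0.0))

# ...so an absolute tolerance has to be supplied for values near zero.
print(math.isclose(1e-12, 0.0, abs_tol=1e-9))
```

The right tolerance depends on the magnitudes involved and on how much error the preceding arithmetic can accumulate, so the defaults are a starting point rather than a rule.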
Side Note
This is obviously part of a stripped-down example, but what I'm trying to do is come up with classes for length, volume, etc. that can compare against other objects of the same class but with different units.
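As a rough sketch of what such a class might look like, the hypothetical `Length` below stores its value in the base unit (cm) and compares with a relative tolerance instead of `==` on raw floats. The class name, the `to` method, the trimmed factor table and the tolerance are all assumptions made for illustration, and the code is written for Python 3.

```python
import math

class Length:
    """Hypothetical unit-aware length; the value is stored in centimetres."""

    # Units per centimetre, following the convention of the factors table.
    FACTORS = {'cm': 1.0, 'mm': 10.0, 'm': 0.01, 'in': 1.0 / 2.54}

    def __init__(self, value, unit):
        self.cm = value / self.FACTORS[unit]

    def to(self, unit):
        """Return the length expressed in the requested unit."""
        return self.cm * self.FACTORS[unit]

    def __eq__(self, other):
        if not isinstance(other, Length):
            return NotImplemented
        # Tolerant comparison in the base unit.
        return math.isclose(self.cm, other.cm, rel_tol=1e-9)

    # Tolerant equality cannot be kept consistent with hashing.
    __hash__ = None

print(Length(25.4, 'mm') == Length(1.0, 'in'))
print(Length(1.0, 'm') == Length(1.0, 'cm'))
```

One design caveat: tolerance-based equality is not transitive (a can be close to b and b close to c without a being close to c), which is why the sketch disables hashing rather than pretending the objects are safe as dictionary keys.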
Rhetorical Questions
- If this is a potentially dangerous problem, since it makes programs behave in a nondeterministic manner, should compilers warn or error when they detect that you're checking equality of floats?
- Should compilers support an option to replace all float equality checks with a 'close enough' function?
- Do compilers already do this and I just can't find the information?
Thanks for your responses. Most were very good and provided good links so I'll just say that and answer my own question.
Caspin posted this link.
He also mentioned that Google Test uses ULP comparison, and when I looked at the Google code I saw that it mentions the exact same link to cygnus-software.
I wound up implementing some of the algorithms in C as a Python extension and then later found that I could do it in pure Python as well. The code is posted below.
In the end, I will probably just wind up adding ULP differences to my bag of tricks.
It was interesting to see how many floating-point values lie between what should be two equal numbers that never left memory. One of the articles or the Google code I read said that 4 was a good number... but here I was able to hit 10.
>>> f1 = 25.4
>>> f2 = f1
>>>
>>> for i in xrange(1, 11):
...     f2 /= 10.0           # to cm
...     f2 *= (1.0 / 2.54)   # to in
...     f2 *= 25.4           # back to mm
...     print 'after %2d loops there are %2d doubles between them' % (i, dulpdiff(f1, f2))
...
after 1 loops there are 1 doubles between them
after 2 loops there are 2 doubles between them
after 3 loops there are 3 doubles between them
after 4 loops there are 4 doubles between them
after 5 loops there are 6 doubles between them
after 6 loops there are 7 doubles between them
after 7 loops there are 8 doubles between them
after 8 loops there are 10 doubles between them
after 9 loops there are 10 doubles between them
after 10 loops there are 10 doubles between them
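For readers on Python 3, the same experiment can be approximated with the sketch below. The names `ordered_bits`, `ulp_distance` and `almost_equal_ulps` are hypothetical, and the default of 4 ULPs is taken from the "4 was a good number" remark above rather than from any standard. The fixed-size struct formats `'<d'` and `'<q'` are used so the result does not depend on the platform's native `long`. NaN and infinities are not treated specially here.

```python
import struct

def ordered_bits(x):
    """Map a double to an integer whose ordering matches the float ordering."""
    (bits,) = struct.unpack('<q', struct.pack('<d', x))
    # Negative floats have the sign bit set, so their signed bit pattern is
    # negative. Mirror them so that -0.0 maps to 0 and floats further below
    # zero map to more negative integers.
    return bits if bits >= 0 else -(1 << 63) - bits

def ulp_distance(a, b):
    """Number of representable doubles separating a and b."""
    return abs(ordered_bits(a) - ordered_bits(b))

def almost_equal_ulps(a, b, max_ulps=4):
    """True if a and b are at most max_ulps representable doubles apart."""
    return ulp_distance(a, b) <= max_ulps

f1 = 25.4
f2 = f1
for i in range(1, 11):
    f2 /= 10.0            # to cm
    f2 *= (1.0 / 2.54)    # to in
    f2 *= 25.4            # back to mm
    print(i, ulp_distance(f1, f2), almost_equal_ulps(f1, f2))
```

A fixed ULP budget works well when both values come from a short chain of operations; as the loop shows, error can accumulate past any small budget if the chain is long enough.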
Also interesting is how many floating-point values there are between two supposedly equal numbers when one of them has been written out as a string and read back in.
>>> # 0 degrees Fahrenheit is -32 / 1.8 degrees Celsius
... f = -32 / 1.8
>>> s = str(f)
>>> s
'-17.7777777778'
>>> # floats between them...
... fulpdiff(f, float(s))
0
>>> # doubles between them...
... dulpdiff(f, float(s))
6255L
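The gap above comes from the string keeping only 12 significant digits, which is what `str()` does for floats in Python 2. A small sketch of the difference, assuming Python 3 (where `repr` produces the shortest string that converts back to the same double):

```python
f = -32 / 1.8

short = format(f, '.12g')   # 12 significant digits, like Python 2's str()
full = repr(f)              # shortest string that round-trips exactly

# The truncated string reads back as a different double.
print(short, float(short) == f)

# The repr string reads back as the identical double.
print(full, float(full) == f)
```

When floats have to survive a trip through text, writing them with `repr` (or with 17 significant digits) avoids introducing the difference in the first place.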
import struct
from functools import partial

# (c) 2010 Eric L. Frederich
#
# Python implementation of algorithms detailed here...
# from http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm

def c_mem_cast(x, f=None, t=None):
    '''
    Do a C-style memory cast.

    In Python...
        x = 12.34
        y = c_mem_cast(x, 'd', 'q')
    ... should be equivalent to the following in C...
        double x = 12.34;
        long long y = *(long long*)&x;
    '''
    return struct.unpack(t, struct.pack(f, x))[0]

# 'q' (8 bytes) is used instead of 'l': a native 'l' is only 4 bytes on
# some platforms and then cannot hold the 8 bytes of a double.
dbl_to_lng = partial(c_mem_cast, f='d', t='q')
lng_to_dbl = partial(c_mem_cast, f='q', t='d')
flt_to_int = partial(c_mem_cast, f='f', t='i')
int_to_flt = partial(c_mem_cast, f='i', t='f')

def ulp_diff_maker(converter, negative_zero):
    '''
    Getting the ULP difference of floats and doubles is similar.
    The only differences are the offset and the converter.
    '''
    def the_diff(a, b):
        # Make ai lexicographically ordered as a twos-complement int.
        # Python ints do not wrap around like C ints, so subtract from
        # -negative_zero; this maps -0.0 to 0, the same as +0.0.
        ai = converter(a)
        if ai < 0:
            ai = -negative_zero - ai
        # Make bi lexicographically ordered as a twos-complement int
        bi = converter(b)
        if bi < 0:
            bi = -negative_zero - bi
        return abs(ai - bi)
    return the_diff

# double ULP difference
dulpdiff = ulp_diff_maker(dbl_to_lng, 0x8000000000000000)
# float ULP difference
fulpdiff = ulp_diff_maker(flt_to_int, 0x80000000)
# default to double ULP difference
ulpdiff = dulpdiff
ulpdiff.__doc__ = '''
Get the number of doubles between two doubles.
'''