Python/numpy floating-point text precision


Problem description

假设我有一些32位和64位的浮点值:

 >>> import numpy as np 
>>> v32 = np.array([5,0.1,2.4,4.555555555555555,12345678.92345678635],
dtype = np.float32)
>>> v64 = np.array([5,0.1,2.4,4.555555555555555,12345678.92345678635],
dtype = np.float64)

我想将这些值序列化为文本,而不会丢失精度(或者至少真的接近不会失去精度)。我想这样做的标准方式是 repr

 > ;>> map(repr,v32)
['5.0','0.1','2.4000001','4.5555553','12345679.0']
>>> map(repr,v64)
['5.0','0.10000000000000001','2.3999999999999999','4.5555555555555554',
'12345678.923456786']

但是我想使表示尽可能紧凑以最小化文件大小,所以如果像2.4这样的值被序列化了,而没有额外的小数,那就好了。是的,我知道这是他们实际的浮点表示,但是%g 似乎可以处理这个问题:

 >>> ('%.7g'* len(v32))%tuple(v32)
'5 0.1 2.4 4.555555 1.234568e + 07'
>>> ('%.16g'* len(v32))%tuple(v64)
'5 0.1 2.4 4.555555555555555 12345678.92345679'

我的问题是:以这种方式使用%g 是否安全?是 .7 .16 正确的值,这样精度不会丢失?

$ b Python 2.7和更高版本已经有一个漂亮的 repr 实现,它将0.1打印为 0.1 。摘要输出优先于其他候选项,例如 0.10000000000000001 ,因为它是该特定数字的最短表示,它往返于完全相同的浮动当读回到Python中的点值。要使用这个算法,在把它们交给 repr



<$ p之前,将你的64位浮点数转换为实际的Python浮点数$ p> >>> map(repr,map(float,v64))
['5.0','0.1','2.4','4.555555555555555','12345678.923456786']

令人惊讶的是,结果是自然的在数字上是正确的。有关2.7 / 3.2 repr 的更多信息可以在最新消息精彩的演讲

不幸的是,这个技巧不适用于32位的浮点数,至少不是没有重新实现Python 2.7的 repr使用的算法


Let's say I have some 32-bit and 64-bit floating point values:

>>> import numpy as np
>>> v32 = np.array([5, 0.1, 2.4, 4.555555555555555, 12345678.92345678635], 
                   dtype=np.float32)
>>> v64 = np.array([5, 0.1, 2.4, 4.555555555555555, 12345678.92345678635], 
                   dtype=np.float64)

I want to serialize these values to text without losing precision (or at least really close to not losing precision). I think the canonical way of doing this is with repr:

>>> map(repr, v32)
['5.0', '0.1', '2.4000001', '4.5555553', '12345679.0']
>>> map(repr, v64)
['5.0', '0.10000000000000001', '2.3999999999999999', '4.5555555555555554', 
 '12345678.923456786']

But I want to make the representation as compact as possible to minimize file size, so it would be nice if values like 2.4 got serialized without the extra decimals. Yes, I know that's their actual floating point representation, but %g seems to be able to take care of this:

>>> ('%.7g ' * len(v32)) % tuple(v32)
'5 0.1 2.4 4.555555 1.234568e+07 '
>>> ('%.16g ' * len(v64)) % tuple(v64)
'5 0.1 2.4 4.555555555555555 12345678.92345679 '

My question is: is it safe to use %g in this way? Are .7 and .16 the correct values so that precision won't be lost?
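For reference, a quick round-trip check suggests those exact digit counts are not quite enough. This is a sketch assuming IEEE-754 semantics: the 9- and 17-digit figures below are the standard's shortest round-trip guarantees for binary32/binary64, not values taken from the question itself.

```python
import numpy as np

# IEEE-754 guarantees: 9 significant decimal digits round-trip any
# float32, and 17 digits round-trip any float64. %.7g / %.16g are one
# digit short and can silently lose the last bit.
x64 = 0.1 + 0.2                            # 0.30000000000000004 needs 17 digits
assert np.float64('%.16g' % x64) != x64    # 16 digits: '0.3', last bit lost
assert np.float64('%.17g' % x64) == x64    # 17 digits: exact round-trip

x32 = np.float32(12345678.92345678635)     # stored as 12345679.0
assert np.float32('%.7g' % x32) != x32     # 7 digits: rounds to 12345680
assert np.float32('%.9g' % x32) == x32     # 9 digits: exact round-trip
```

So `%.9g` and `%.17g` are the safe choices if `%g` is used at all, at the cost of some compactness.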

Solution

Python 2.7 and later already have a smart repr implementation for floats that prints 0.1 as 0.1. The brief output is chosen in preference to other candidates such as 0.10000000000000001 because it is the shortest representation of that particular number that roundtrips to the exact same floating-point value when read back into Python. To use this algorithm, convert your 64-bit floats to actual Python floats before handing them off to repr:

>>> map(repr, map(float, v64))
['5.0', '0.1', '2.4', '4.555555555555555', '12345678.923456786']
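Putting this together for serialization, you can join those shortest reprs into one compact line and parse them back without loss (a sketch of the round trip):

```python
import numpy as np

v64 = np.array([5, 0.1, 2.4, 4.555555555555555, 12345678.92345678635],
               dtype=np.float64)

# Serialize: shortest round-tripping repr of each value, space-separated.
text = ' '.join(repr(float(x)) for x in v64)

# Deserialize: parse the tokens back and confirm nothing was lost.
restored = np.array([float(tok) for tok in text.split()], dtype=np.float64)
assert (restored == v64).all()
```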

Surprisingly, the result is natural-looking and numerically correct. More info on the 2.7/3.2 repr can be found in What's New and a fascinating lecture by Mark Dickinson.

Unfortunately, this trick won't work for 32-bit floats, at least not without reimplementing the algorithm used by Python 2.7's repr.
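One possible workaround, assuming a reasonably recent NumPy (1.14 and later ship a Dragon4-based shortest-repr formatter), is `np.format_float_positional` with `unique=True`, which produces the shortest decimal string that round-trips to the same float32:

```python
import numpy as np

v32 = np.array([5, 0.1, 2.4, 4.555555555555555, 12345678.92345678635],
               dtype=np.float32)

# unique=True asks for the shortest string that parses back to exactly
# the same float32; trim='-' drops trailing zeros and a bare decimal point.
compact = [np.format_float_positional(x, unique=True, trim='-') for x in v32]

# Every string round-trips to the original float32 value.
assert all(np.float32(s) == x for s, x in zip(compact, v32))
```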

