Truncating floats in Python


Problem description



I want to remove digits from a float to have a fixed number of digits after the dot, like:

1.923328437452 -> 1.923

I need to output as a string to another function, not print.

Also I want to ignore the lost digits, not round them.

Solution

First, the function, for those who just want some copy-and-paste code:

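def truncate(f, n):
    '''Truncates/pads a float f to n decimal places without rounding'''
    s = '{}'.format(f)  # shortest decimal string that round-trips to f
    if 'e' in s or 'E' in s:
        # exponential notation: fall back to fixed-point formatting (which rounds)
        return '{0:.{1}f}'.format(f, n)
    i, p, d = s.partition('.')  # integer part, separator, decimal digits
    return '.'.join([i, (d+'0'*n)[:n]])  # pad with zeros, then cut to n digits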

This is valid in Python 2.7 and 3.1+. For older versions, it's not possible to get the same "intelligent rounding" effect (at least, not without a lot of complicated code), but rounding to 12 decimal places before truncation will work much of the time:

def truncate(f, n):
    '''Truncates/pads a float f to n decimal places without rounding'''
    s = '%.12f' % f
    i, p, d = s.partition('.')
    return '.'.join([i, (d+'0'*n)[:n]])
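
As a quick check of the behaviour described above, here is a small sketch that repeats the first version of truncate and runs it on a few inputs. Apart from 1.923328437452, the sample values are picked here purely for illustration, and the outputs assume Python 2.7 or 3.1+.

```python
def truncate(f, n):
    '''Truncates/pads a float f to n decimal places without rounding'''
    s = '{}'.format(f)
    if 'e' in s or 'E' in s:
        return '{0:.{1}f}'.format(f, n)
    i, p, d = s.partition('.')
    return '.'.join([i, (d + '0' * n)[:n]])

# Illustrative inputs (assumed for this example, not from the question)
print(truncate(1.923328437452, 3))       # 1.923
print(truncate(1.5, 3))                  # 1.500 (padded with zeros)
print(truncate(-1.999, 2))               # -1.99 (cut, not rounded)
print(truncate(0.29999999999999998, 1))  # 0.3 (same float as 0.3)
```

One edge case worth knowing about: with n = 0 the function keeps the separator, so truncate(1.5, 0) returns '1.' rather than '1'.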

Explanation

The core of the underlying method is to convert the value to a string at full precision and then just chop off everything beyond the desired number of characters. The latter step is easy; it can be done either with string manipulation

i, p, d = s.partition('.')
'.'.join([i, (d+'0'*n)[:n]])

or the decimal module

str(Decimal(s).quantize(Decimal((0, (1,), -n)), rounding=ROUND_DOWN))
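
To make the decimal variant concrete, the sketch below wraps that one-liner in a function. The name truncate_decimal and the use of repr(f) to obtain the string s are assumptions made for this example; the quantize call itself is the one shown above.

```python
from decimal import Decimal, ROUND_DOWN

def truncate_decimal(f, n):
    '''Hypothetical helper: truncate float f to n decimal places via Decimal.'''
    s = repr(f)                       # "nice" string form (Python 2.7 / 3.1+)
    quantum = Decimal((0, (1,), -n))  # sign 0, digit 1, exponent -n, i.e. 10**-n
    # ROUND_DOWN discards the extra digits (rounds toward zero)
    return str(Decimal(s).quantize(quantum, rounding=ROUND_DOWN))

print(truncate_decimal(1.923328437452, 3))  # 1.923
print(truncate_decimal(1.5, 3))             # 1.500
print(truncate_decimal(-1.999, 2))          # -1.99
```

Unlike the string version, this one copes with exponent notation in s without a special case, because Decimal parses strings such as '1e-07' directly.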

The first step, converting to a string, is quite difficult because there are some pairs of floating point literals (i.e. what you write in the source code) which both produce the same binary representation and yet should be truncated differently. For example, consider 0.3 and 0.29999999999999998. If you write 0.3 in a Python program, the compiler encodes it using the IEEE floating-point format into the sequence of bits (assuming a 64-bit float)

0011111111010011001100110011001100110011001100110011001100110011

This is the closest value to 0.3 that can accurately be represented as an IEEE float. But if you write 0.29999999999999998 in a Python program, the compiler translates it into exactly the same value. In one case, you meant it to be truncated (to one digit) as 0.3, whereas in the other case you meant it to be truncated as 0.2, but Python can only give one answer. This is a fundamental limitation of Python, or indeed any programming language without lazy evaluation. The truncation function only has access to the binary value stored in the computer's memory, not the string you actually typed into the source code.[1]
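
The claim that both literals end up as the same 64 bits can be checked directly. This is a minimal sketch using the standard struct module; float_bits is a helper name invented for the example.

```python
import struct

def float_bits(x):
    '''Hypothetical helper: the 64-bit IEEE 754 pattern of x as a string of 0s and 1s.'''
    # Pack as a big-endian double, then reinterpret the same 8 bytes as an unsigned integer
    (as_int,) = struct.unpack('>Q', struct.pack('>d', x))
    return format(as_int, '064b')

print(float_bits(0.3))
print(float_bits(0.29999999999999998))
print(0.3 == 0.29999999999999998)  # True: the two literals are the same float
```

Both calls print the bit sequence shown above, so by the time a function receives the value there is nothing left to tell the two spellings apart.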

If you decode the sequence of bits back into a decimal number, again using the IEEE 64-bit floating-point format, you get

0.2999999999999999888977697537484345957637...

so a naive implementation would come up with 0.2 even though that's probably not what you want. For more on floating-point representation error, see the Python tutorial.
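
For illustration, the exact stored value can be inspected by passing the float itself (not a string) to Decimal, and a "naive" truncation can be built on top of it. The function name naive_truncate is an assumption for this sketch.

```python
from decimal import Decimal, ROUND_DOWN

def naive_truncate(f, n):
    '''Hypothetical helper: truncate the exact binary value of f, with no "nice" rounding.'''
    exact = Decimal(f)                # exact decimal expansion of the stored double
    quantum = Decimal((0, (1,), -n))  # 10**-n
    return str(exact.quantize(quantum, rounding=ROUND_DOWN))

print(Decimal(0.3))            # starts with 0.29999999999999998889776975...
print(naive_truncate(0.3, 1))  # 0.2
```

The result 0.2 is faithful to the bits in memory, but it is rarely what someone who typed 0.3 expects.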

It's very rare to be working with a floating-point value that is so close to a round number and yet is intentionally not equal to that round number. So when truncating, it probably makes sense to choose the "nicest" decimal representation out of all that could correspond to the value in memory. Python 2.7 and up (but not 3.0) includes a sophisticated algorithm to do just that, which we can access through the default string formatting operation.

'{}'.format(f)

The only caveat is that this acts like a g format specification, in the sense that it uses exponential notation (1.23e+4) if the number is large or small enough. So the method has to catch this case and handle it differently. There are a few cases where using an f format specification instead causes a problem, such as trying to truncate 3e-10 to 28 digits of precision (it produces 0.0000000002999999999999999980), and I'm not yet sure how best to handle those.
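
A short sketch of both behaviours, assuming Python 2.7 or 3.1+: the default formatting picks the shortest string that maps back to the same float, and it switches to exponential notation for very large or very small magnitudes. The sample values are chosen here for illustration.

```python
# Default formatting picks the "nicest" representation
print('{}'.format(0.29999999999999998))  # 0.3
print('{}'.format(1.1))                  # 1.1

# ...but uses exponential notation when the magnitude is extreme
print('{}'.format(1.23e+20))             # 1.23e+20
print('{}'.format(3e-10))                # 3e-10

# Falling back to an f format avoids the exponent, at the cost of
# exposing the binary representation when many digits are requested
# (the text above reports 0.0000000002999999999999999980 for this call)
many_digits = '{0:.28f}'.format(3e-10)
print(many_digits)
```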

If you actually are working with floats that are very close to round numbers but intentionally not equal to them (like 0.29999999999999998 or 99.959999999999994), this will produce some false positives, i.e. it'll round numbers that you didn't want rounded. In that case the solution is to specify a fixed precision.

'{0:.{1}f}'.format(f, sys.float_info.dig + n + 2)

The number of digits of precision to use here doesn't really matter; it only needs to be large enough to ensure that any rounding performed in the string conversion doesn't "bump up" the value to its nice decimal representation. I think sys.float_info.dig + n + 2 may be enough in all cases, but if not, that 2 might have to be increased, and it doesn't hurt to do so.
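
Put together, a fixed-precision variant might look like the sketch below. The name truncate_fixed is made up for this example, and the + 2 margin is the author's guess from above rather than a proven bound.

```python
import sys

def truncate_fixed(f, n):
    '''Hypothetical helper: truncate f to n places using a fixed, generous precision.'''
    # sys.float_info.dig is 15 for IEEE 754 doubles; the extra n + 2 digits keep
    # the formatting step from rounding the value up to its "nice" form
    s = '{0:.{1}f}'.format(f, sys.float_info.dig + n + 2)
    i, p, d = s.partition('.')
    return '.'.join([i, d[:n]])

print(truncate_fixed(0.29999999999999998, 1))  # 0.2
print(truncate_fixed(99.959999999999994, 2))   # 99.95
print(truncate_fixed(1.5, 2))                  # 1.50
```

The trade-off is the reverse of the default version: values that were meant to be 0.3 or 99.96 now truncate to 0.2 and 99.95.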

In earlier versions of Python (up to 2.6, or 3.0), floating-point number formatting was much cruder, and would regularly produce things like

>>> 1.1
1.1000000000000001

If this is your situation and you do want to use "nice" decimal representations for truncation, all you can do (as far as I know) is pick some number of digits, less than the full precision representable by a float, and round the number to that many digits before truncating it. A typical choice is 12,

'%.12f' % f

but you can adjust this to suit the numbers you're using.
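
As a rough illustration of what rounding to 12 places first does, the sketch below repeats the older-version function under the assumed name truncate_12 and tries it on a few values picked for this example. It runs on current Python too, since % formatting still works the same way.

```python
def truncate_12(f, n):
    '''Truncates/pads a float f to n decimal places after rounding to 12 places.'''
    s = '%.12f' % f  # rounding happens here, at the 12th decimal place
    i, p, d = s.partition('.')
    return '.'.join([i, (d + '0' * n)[:n]])

print(truncate_12(0.3, 1))              # 0.3
print(truncate_12(1.923328437452, 3))   # 1.923
# Side effect of the pre-rounding: thirteen 9s round up to 1.000000000000
print(truncate_12(0.9999999999999, 3))  # 1.000
```

The last line shows why the choice of 12 is a compromise: anything that differs from a round number only beyond the 12th decimal place is treated as that round number.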


[1] Well... I lied. Technically, you can instruct Python to re-parse its own source code and extract the part corresponding to the first argument you pass to the truncation function. If that argument is a floating-point literal, you can just cut it off a certain number of places after the decimal point and return that. However, this strategy doesn't work if the argument is a variable, which makes it fairly useless. The following is presented for entertainment value only:

import inspect
import io
import tokenize

def trunc_introspect(f, n):
    '''Truncates/pads the float f to n decimal places by looking at the caller's source code'''
    current_frame = None
    caller_frame = None
    s = inspect.stack()
    try:
        current_frame = s[0]
        caller_frame = s[1]
        gen = tokenize.tokenize(io.BytesIO(caller_frame[4][caller_frame[5]].encode('utf-8')).readline)
        for token_type, token_string, _, _, _ in gen:
            if token_type == tokenize.NAME and token_string == current_frame[3]:
                next(gen) # left parenthesis
                token_type, token_string, _, _, _ = next(gen) # float literal
                if token_type == tokenize.NUMBER:
                    try:
                        cut_point = token_string.index('.') + n + 1
                    except ValueError: # no decimal in string
                        return token_string + '.' + '0' * n
                    else:
                        if len(token_string) < cut_point:
                            token_string += '0' * (cut_point - len(token_string))
                        return token_string[:cut_point]
                else:
                    raise ValueError('Unable to find floating-point literal (this probably means you called {} with a variable)'.format(current_frame[3]))
                break
    finally:
        del s, current_frame, caller_frame

Generalizing this to handle the case where you pass in a variable seems like a lost cause, since you'd have to trace backwards through the program's execution until you find the floating-point literal which gave the variable its value. If there even is one. Most variables will be initialized from user input or mathematical expressions, in which case the binary representation is all there is.
