How to truncate float values?


Problem description

I want to remove digits from a float so that it has a fixed number of digits after the decimal point, like:

1.923328437452 → 1.923

I need the result as a string to pass to another function, not to print.

Also, I want to discard the lost digits, not round them.

Recommended answer

First, the function, for those who just want some copy-and-paste code:

def truncate(f, n):
    '''Truncates/pads a float f to n decimal places without rounding'''
    s = '{}'.format(f)
    if 'e' in s or 'E' in s:
        return '{0:.{1}f}'.format(f, n)
    i, p, d = s.partition('.')
    return '.'.join([i, (d+'0'*n)[:n]])

This works in Python 2.7 and 3.1+. For older versions, it's not possible to get the same "intelligent rounding" effect (at least, not without a lot of complicated code), but rounding to 12 decimal places before truncating will work much of the time:

def truncate(f, n):
    '''Truncates/pads a float f to n decimal places without rounding'''
    s = '%.12f' % f
    i, p, d = s.partition('.')
    return '.'.join([i, (d+'0'*n)[:n]])
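
As a quick check of the two functions above, here is a minimal sketch that puts both in one file and calls them on a few values. It assumes Python 3; the name `truncate_legacy` is made up here only so that both versions can coexist, and is not part of the original answer.

```python
def truncate(f, n):
    '''Truncates/pads a float f to n decimal places without rounding'''
    s = '{}'.format(f)
    if 'e' in s or 'E' in s:
        return '{0:.{1}f}'.format(f, n)
    i, p, d = s.partition('.')
    return '.'.join([i, (d + '0' * n)[:n]])


def truncate_legacy(f, n):
    '''Same idea for older Pythons: round to 12 places first, then cut.
    (Hypothetical name, used here only to keep both versions side by side.)'''
    s = '%.12f' % f
    i, p, d = s.partition('.')
    return '.'.join([i, (d + '0' * n)[:n]])


print(truncate(1.923328437452, 3))         # 1.923
print(truncate(0.3, 1))                    # 0.3
print(truncate(1.5, 4))                    # 1.5000 (padded with zeros)
print(truncate_legacy(1.923328437452, 3))  # 1.923
```

Both functions return a string, which is what the question asks for.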

Explanation

The core of the underlying method is to convert the value to a string at full precision and then chop off everything beyond the desired number of characters. The latter step is easy; it can be done either with string manipulation

i, p, d = s.partition('.')
'.'.join([i, (d+'0'*n)[:n]])

or the decimal module

str(Decimal(s).quantize(Decimal((0, (1,), -n)), rounding=ROUND_DOWN))
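
The decimal one-liner above needs its imports to run. A self-contained sketch, assuming the input is already a decimal string `s`; the helper name `truncate_decimal` is hypothetical:

```python
from decimal import Decimal, ROUND_DOWN


def truncate_decimal(s, n):
    """Truncate the decimal string s to n places (hypothetical helper name)."""
    # Decimal((sign, digits, exponent)): sign 0, digits (1,), exponent -n is 10**-n
    step = Decimal((0, (1,), -n))
    # ROUND_DOWN rounds toward zero, i.e. it simply drops the extra digits
    return str(Decimal(s).quantize(step, rounding=ROUND_DOWN))


print(truncate_decimal('1.923328437452', 3))  # 1.923
print(truncate_decimal('-1.9239', 3))         # -1.923
print(truncate_decimal('1.5', 4))             # 1.5000
```

Like the string version, this pads with zeros when the input has fewer than n decimal places.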

The first step, converting to a string, is quite difficult because there are some pairs of floating-point literals (i.e. what you write in the source code) which both produce the same binary representation and yet should be truncated differently. For example, consider 0.3 and 0.29999999999999998. If you write 0.3 in a Python program, the compiler encodes it using the IEEE floating-point format into the sequence of bits (assuming a 64-bit float)

0011111111010011001100110011001100110011001100110011001100110011

This is the closest value to 0.3 that can be represented exactly as an IEEE float. But if you write 0.29999999999999998 in a Python program, the compiler translates it into exactly the same value. In one case, you meant it to be truncated (to one digit) as 0.3, whereas in the other case you meant it to be truncated as 0.2, but Python can only give one answer. This is a fundamental limitation of Python, or indeed any programming language without lazy evaluation. The truncation function only has access to the binary value stored in the computer's memory, not the string you actually typed into the source code. [1]
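
The claim that both literals map to the same bits can be checked directly. This is a small sketch using the standard `struct` module; the helper name `float_bits` is an assumption made for illustration.

```python
import struct


def float_bits(x):
    """Return the 64-bit IEEE 754 pattern of x as a string of 0s and 1s."""
    # Pack as a big-endian double, then reinterpret the 8 bytes as an unsigned int
    (as_int,) = struct.unpack('>Q', struct.pack('>d', x))
    return format(as_int, '064b')


print(float_bits(0.3))
print(0.3 == 0.29999999999999998)                          # True
print(float_bits(0.3) == float_bits(0.29999999999999998))  # True
```

The first line printed is the bit sequence quoted above.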

If you decode the sequence of bits back into a decimal number, again using the IEEE 64-bit floating-point format, you get

0.2999999999999999888977697537484345957637...

so a naive implementation would come up with 0.2 even though that's probably not what you want. For more on floating-point representation error, see the Python tutorial.
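
One way to see both the exact stored value and the naive result is to convert the float to a `Decimal`, which preserves the binary value without rounding. This is only an illustration of the problem, not a recommended way to truncate:

```python
from decimal import Decimal, ROUND_DOWN

exact = Decimal(0.3)  # exact value of the stored double, no rounding applied
print(exact)          # begins 0.29999999999999998889776975...

# Truncating the exact value to one decimal place gives the "naive" answer
naive = exact.quantize(Decimal('0.1'), rounding=ROUND_DOWN)
print(naive)          # 0.2
```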

It's very rare to be working with a floating-point value that is so close to a round number and yet is intentionally not equal to that round number. So when truncating, it probably makes sense to choose the "nicest" decimal representation out of all those that could correspond to the value in memory. Python 2.7 and up (but not 3.0) includes a sophisticated algorithm to do just that, which we can access through the default string formatting operation.

'{}'.format(f)

The only caveat is that this acts like a g format specification, in the sense that it uses exponential notation (1.23e+4) if the number is large or small enough. So the method has to catch this case and handle it differently. There are a few cases where using an f format specification instead causes a problem, such as trying to truncate 3e-10 to 28 digits of precision (it produces 0.0000000002999999999999999980), and I'm not yet sure how best to handle those.
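
To make the caveat concrete, here is a short sketch of when the default formatting switches to exponential notation, assuming Python 3. The helper name `uses_exponent` is hypothetical; it mirrors the check in the function at the top of the answer.

```python
def uses_exponent(f):
    """True if the default formatting of f contains an exponent (hypothetical helper)."""
    s = '{}'.format(f)
    return 'e' in s or 'E' in s


for value in (0.3, 12300.0, 0.0001, 0.00001, 1e16):
    print('{}'.format(value), uses_exponent(value))
# 0.3 False
# 12300.0 False
# 0.0001 False
# 1e-05 True
# 1e+16 True

# The fallback used in that case: fixed-point formatting with n decimal places
print('{0:.{1}f}'.format(0.00001, 7))  # 0.0000100
```

Note that the fixed-point fallback rounds to n places rather than truncating, so very small or very large inputs are handled slightly differently from the rest.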

If you actually are working with floats that are very close to round numbers but intentionally not equal to them (like 0.29999999999999998 or 99.959999999999994), this will produce some false positives, i.e. it'll round numbers that you didn't want rounded. In that case the solution is to specify a fixed precision.

'{0:.{1}f}'.format(f, sys.float_info.dig + n + 2)

The number of digits of precision to use here doesn't really matter; it only needs to be large enough to ensure that any rounding performed in the string conversion doesn't "bump up" the value to its nice decimal representation. I think sys.float_info.dig + n + 2 may be enough in all cases, but if not, that 2 might have to be increased, and it doesn't hurt to do so.
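
Combining the fixed-precision format above with the string-chopping step gives a variant that never snaps to the "nice" representation. A sketch only; the name `truncate_fixed` is hypothetical and not from the original answer:

```python
import sys


def truncate_fixed(f, n):
    """Truncate f to n places from a fixed-precision expansion (hypothetical name)."""
    digits = sys.float_info.dig + n + 2  # dig is 15 for IEEE 754 doubles
    s = '{0:.{1}f}'.format(f, digits)    # always has more than n decimal places
    i, p, d = s.partition('.')
    return '.'.join([i, d[:n]])


print(truncate_fixed(1.923328437452, 3))  # 1.923
print(truncate_fixed(0.3, 1))             # 0.2
print(truncate_fixed(99.96, 2))           # 99.95
```

The last two results show the trade-off: values written as 0.3 or 99.96 are stored slightly below those numbers, so this variant truncates them downward.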

In earlier versions of Python (up to 2.6, or 3.0), floating-point number formatting was a lot more crude, and would regularly produce things like

>>> 1.1
1.1000000000000001

If this is your situation and you do want to use "nice" decimal representations for truncation, all you can do (as far as I know) is pick some number of digits, less than the full precision representable by a float, and round the number to that many digits before truncating it. A typical choice is 12,

'%.12f' % f

but you can adjust this to suit the numbers you're using.

[1] Well... I lied. Technically, you can instruct Python to re-parse its own source code and extract the part corresponding to the first argument you pass to the truncation function. If that argument is a floating-point literal, you can just cut it off a certain number of places after the decimal point and return that. However, this strategy doesn't work if the argument is a variable, which makes it fairly useless. The following is presented for entertainment value only:

import inspect
import io
import tokenize

def trunc_introspect(f, n):
    '''Truncates/pads the float f to n decimal places by looking at the caller's source code'''
    current_frame = None
    caller_frame = None
    s = inspect.stack()
    try:
        current_frame = s[0]
        caller_frame = s[1]
        gen = tokenize.tokenize(io.BytesIO(caller_frame[4][caller_frame[5]].encode('utf-8')).readline)
        for token_type, token_string, _, _, _ in gen:
            if token_type == tokenize.NAME and token_string == current_frame[3]:
                next(gen) # left parenthesis
                token_type, token_string, _, _, _ = next(gen) # float literal
                if token_type == tokenize.NUMBER:
                    try:
                        cut_point = token_string.index('.') + n + 1
                    except ValueError: # no decimal in string
                        return token_string + '.' + '0' * n
                    else:
                        if len(token_string) < cut_point:
                            token_string += '0' * (cut_point - len(token_string))
                        return token_string[:cut_point]
                else:
                    raise ValueError('Unable to find floating-point literal (this probably means you called {} with a variable)'.format(current_frame[3]))
                break
    finally:
        del s, current_frame, caller_frame

Generalizing this to handle the case where you pass in a variable seems like a lost cause, since you'd have to trace backwards through the program's execution until you find the floating-point literal which gave the variable its value, if there even is one. Most variables will be initialized from user input or mathematical expressions, in which case the binary representation is all there is.
