Byte array to Int in Python 2.x using Standard Libraries
I am comfortable with Python 3.x and with bytearray-to-integer conversion using int.from_bytes(), and came up with the conversion snippet below. Is there a way to achieve the same functionality in Python 2, for both positive and negative integers?
val = bytearray(b'\x8f\x0f\xfd\x02\xf4\x95s\x00\x00')
a = int.from_bytes(val, byteorder='big', signed=True)
# print(type(a), type(val), val, a)
# <class 'int'> <class 'bytearray'> bytearray(b'\x8f\x0f\xfd\x02\xf4\x95s\x00\x00') -2083330000000000000000
I need to use only the Python 2.7 standard library to convert a byte array to an int. For example:
bytearray(b'\x00') --> Expected Result: 0
bytearray(b'\xef\xbc\xa9\xe5w\xd6\xd0\x00\x00') --> Expected Result: -300000000000000000000
bytearray(b'\x10CV\x1a\x88)0\x00\x00') --> Expected Result: 300000000000000000000
There is no built-in function in Python 2.7 to do the equivalent of int.from_bytes in 3.2+; that's why the method was added in the first place.
If you don't care about handling any cases other than big-endian signed ints, and care about readability more than performance (so you can extend it or maintain it yourself), the simplest solution is probably an explicit loop over the bytes.
For unsigned, this would be easy:
n = 0
for by in b:
    n = n * 256 + by
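As a quick check of the unsigned loop, here is a minimal sketch that wraps it in a function and runs it on two inputs. The name uint_from_bytes is made up for this illustration and is not part of any library; the second input is the positive example from the question.

```python
def uint_from_bytes(b):
    """Big-endian unsigned conversion (uint_from_bytes is a hypothetical name)."""
    n = 0
    for by in b:
        # Shift the running total left by one byte, then add the next byte.
        n = n * 256 + by
    return n

print(uint_from_bytes(bytearray(b'\x01\x00')))                    # 256
print(uint_from_bytes(bytearray(b'\x10CV\x1a\x88)0\x00\x00')))    # 300000000000000000000
```

The same code runs unchanged on Python 2.7 and Python 3, because iterating a bytearray yields ints on both.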
But to handle negative numbers, you need to do three things:
- Take off the sign bit from the highest byte. Since we only care about big-endian, this is the 0x80 bit on b[0].
- That makes an empty bytearray a special case, so handle that specially.
- At the end, if the sign bit was set, 2's-complement the result.
So:
def int_from_bytes(b):
    '''Convert big-endian signed integer bytearray to int

    int_from_bytes(b) == int.from_bytes(b, 'big', signed=True)'''
    if not b: # special-case 0 to avoid b[0] raising
        return 0
    n = b[0] & 0x7f # skip sign bit
    for by in b[1:]:
        n = n * 256 + by
    if b[0] & 0x80: # if sign bit is set, 2's complement
        bits = 8*len(b)
        offset = 2**(bits-1)
        return n - offset
    else:
        return n
(This works on any sequence of ints, since it needs indexing, slicing, and len. In Python 3, that includes both bytes and bytearray; in Python 2, it includes bytearray but not str.)
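If the data arrives as a Python 2 str, one way around that limitation is to wrap it in bytearray() before converting. The sketch below repeats the function in condensed form so it runs on its own; the variable raw and its two-byte value are made up for the illustration.

```python
def int_from_bytes(b):
    # Same logic as the function above, repeated so this snippet is self-contained.
    if not b:
        return 0
    n = b[0] & 0x7f                       # drop the sign bit of the first byte
    for by in b[1:]:
        n = n * 256 + by
    if b[0] & 0x80:                       # sign bit was set: subtract 2**(bits-1)
        return n - 2 ** (8 * len(b) - 1)
    return n

raw = b'\xff\x80'   # a str in Python 2, bytes in Python 3
# bytearray(raw) yields ints when indexed on both versions.
print(int_from_bytes(bytearray(raw)))     # -128
```

Here 0xff80 read as a 16-bit signed value is 0x7f80 - 0x8000 = 32640 - 32768 = -128.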
Testing your inputs in Python 3:
>>> for b in (bytearray(b'\x8f\x0f\xfd\x02\xf4\x95s\x00\x00'),
...           bytearray(b'\x00'),
...           bytearray(b'\xef\xbc\xa9\xe5w\xd6\xd0\x00\x00'),
...           bytearray(b'\x10CV\x1a\x88)0\x00\x00')):
...     print(int.from_bytes(b, 'big', signed=True), int_from_bytes(b))
-2083330000000000000000 -2083330000000000000000
0 0
-300000000000000000000 -300000000000000000000
300000000000000000000 300000000000000000000
And in Python 2:
>>> for b in (bytearray(b'\x8f\x0f\xfd\x02\xf4\x95s\x00\x00'),
...           bytearray(b'\x00'),
...           bytearray(b'\xef\xbc\xa9\xe5w\xd6\xd0\x00\x00'),
...           bytearray(b'\x10CV\x1a\x88)0\x00\x00')):
...     print int_from_bytes(b)
-2083330000000000000000
0
-300000000000000000000
300000000000000000000
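If you later need more than big-endian signed input, the loop-based function could be extended along these lines. This is only a sketch: the name int_from_bytes_ex and the default values are my own choices, with parameter names borrowed from int.from_bytes. It also uses a slightly different formulation of the sign handling: accumulate the whole value as unsigned, then subtract 2**bits if the top bit was set.

```python
def int_from_bytes_ex(b, byteorder='big', signed=False):
    """Hypothetical extension; parameter names mirror int.from_bytes."""
    if byteorder not in ('big', 'little'):
        raise ValueError("byteorder must be 'big' or 'little'")
    data = bytearray(b)        # copies the input; also accepts a Python 2 str
    if byteorder == 'little':
        data.reverse()         # normalize to big-endian order
    n = 0
    for by in data:
        n = n * 256 + by       # plain unsigned accumulation
    # Two's complement: a set top bit means the value is n - 2**bits.
    if signed and data and data[0] & 0x80:
        n -= 1 << (8 * len(data))
    return n

print(int_from_bytes_ex(b'\x00\x01', 'little'))               # 256
print(int_from_bytes_ex(b'\x80\xff', 'little', signed=True))  # -128
print(int_from_bytes_ex(b'\xff'))                             # 255
```

Because the input is copied into a new bytearray first, reversing it for little-endian data does not modify the caller's object.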
If this is a bottleneck, there are almost surely faster ways to do this, maybe via gmpy2, for example. In fact, even converting the bytes to a hex string and parsing that string as a base-16 integer might be faster, even though it's more than twice the work, if that moves the main loops from Python to C. Or you could merge the results of calling struct.unpack_from on 8 bytes at a time instead of handling each byte one by one. But this version should be easy to understand and maintain, and doesn't require anything outside the stdlib.
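A minimal sketch of the hex-string idea, using only binascii.hexlify and the built-in int, might look like the following. The name int_from_bytes_hex is made up for this illustration, and whether it actually beats the explicit loop on your data is an assumption you would need to measure (with timeit, for instance).

```python
import binascii

def int_from_bytes_hex(b):
    """Big-endian signed conversion via a hex string (hypothetical name)."""
    data = bytes(bytearray(b))   # str on Python 2, bytes on Python 3
    if not data:
        return 0                 # empty input: nothing to parse
    # hexlify and int(..., 16) both run in C; the result is the unsigned value.
    n = int(binascii.hexlify(data), 16)
    bits = 8 * len(data)
    if n >> (bits - 1):          # top bit set means the value is negative
        n -= 1 << bits
    return n

print(int_from_bytes_hex(bytearray(b'\xef\xbc\xa9\xe5w\xd6\xd0\x00\x00')))  # -300000000000000000000
print(int_from_bytes_hex(bytearray(b'\x10CV\x1a\x88)0\x00\x00')))           # 300000000000000000000
```

The sign handling here subtracts 2**bits from the full unsigned value, which gives the same result as masking off the sign bit first and subtracting 2**(bits-1).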