Python: read a binary file and decode it
Problem description
I am quite new to Python and I need to solve this simple problem. There are already several similar questions, but I still cannot solve it.
I need to read a binary file that is composed of several blocks of bytes. For example, the header is composed of 6 bytes, and I would like to extract those 6 bytes and transform them into a sequence of binary characters, for example 000100110 011001.
navatt_dir='C:/PROCESSING/navatt_read/'
navatt_filename='OSPS_FRMT_NAVATT____20130621T100954_00296_caseB.bin'
navatt_path=navatt_dir+navatt_filename
navatt_file=open(navatt_path, 'rb')
header=list(navatt_file.read(6))
print header
As a result, the list contains the following:
%run C:/PROCESSING/navatt_read/navat_read.py
['\t', 'i', '\xc0', '\x00', '\x00', 't']
This is not what I want.
I would also like to read a particular value in the binary file, knowing its position and length, without reading the whole file. Is that possible?
Thanks
Recommended answer
ByteArray
A bytearray is a mutable sequence of bytes (integers where 0 ≤ x ≤ 255). You can construct a bytearray from a string (if it is not a byte-string, you will have to provide an encoding), from an iterable of byte-sized integers, or from an object with a buffer interface. You can of course also build it manually.
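As a rough illustration of the construction options just listed, here is a minimal sketch. It assumes Python 3 syntax (unlike the Python 2 snippets elsewhere in this answer), and the variable names are made up for the example.

```python
# Three ways to build a bytearray (Python 3 syntax, names are illustrative).
from_text = bytearray("DFH", "ascii")   # text string: an encoding is required
from_ints = bytearray([68, 70, 72])     # iterable of integers in range(256)
from_bytes = bytearray(b"DFH")          # object with the buffer interface

# All three hold the same bytes.
print(from_text == from_ints == from_bytes)  # True

# Built by hand, one piece at a time.
manual = bytearray()
manual.append(0x44)      # a single byte value
manual.extend(b"FH")     # several bytes at once
print(list(manual))      # [68, 70, 72]
```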
An example using a byte-string:
string = b'DFH'
b = bytearray(string)
# Print it as a string
print b
# Prints the individual bytes, showing you that it's just a list of ints
print [i for i in b]
# Lets add one to the D
b[0] += 1
# And print the string again to see the result!
print b
Result:
DFH
[68, 70, 72]
EFH
This is the type you want if you want raw byte manipulation. If what you want is to read 4 bytes as a 32-bit int, you would normally use the struct module and its unpack function, but I usually just shift the bytes together myself from a bytearray.
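To make the two approaches concrete, here is a small sketch in Python 3 syntax. The sample bytes are made up, and big-endian byte order is an assumption: the question does not say how multi-byte values are stored in the file.

```python
import struct

raw = bytearray(b"\x00\x00\x01\x02")  # made-up sample data, 4 bytes

# Option 1: let struct decode a big-endian unsigned 32-bit integer.
# ">" means big-endian, "I" means unsigned 32-bit (both are assumptions here).
(value_struct,) = struct.unpack(">I", raw)

# Option 2: shift the individual bytes together by hand.
value_shift = (raw[0] << 24) | (raw[1] << 16) | (raw[2] << 8) | raw[3]

print(value_struct)  # 258
print(value_shift)   # 258
```

If the file turns out to be little-endian, the format string would be "<I" and the shift order would be reversed.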
What you seem to want is to take the string you have, convert it to a bytearray, and print the bytes as a string in base 2 (binary).
So here is a short example of how to write the header out (I read random data from a file named "dump"):
with open('dump', 'rb') as f:
    header = f.read(6)
    b = bytearray(header)
    print ' '.join([bin(i)[2:].zfill(8) for i in b])
After converting the header to a bytearray, I call bin() on every single byte, which gives back a string with the binary representation we need, in the format "0b1010". I don't want the "0b", so I slice it off with [2:]. Then I use the string method zfill, which prepends as many 0's as are needed for the string to be 8 characters long (the number of bits we need), because bin() does not show leading zeroes.
If you're new to the language, the last line might look quite mean. It uses a list comprehension to make a list of all the binary strings we want to print, and then joins them into the final string with spaces between the elements.
A less pythonic/convoluted variant of the last line would be:
result = []
for byte in b:
    string = bin(byte)[2:]  # Make a binary string and slice off the leading "0b"
    result.append(string.zfill(8))  # Append a 0-padded version to the results list
# Join the list into a space-separated string and print it!
print ' '.join(result)
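The question also asks about reading a value at a known position and length without reading the whole file. One possible approach, sketched here in Python 3 syntax, is to seek to the offset and read only the bytes needed. The helper name read_at, the throwaway sample file, and its payload are all hypothetical; only the six header bytes come from the output shown in the question.

```python
import os
import tempfile

def read_at(path, offset, length):
    """Return `length` bytes starting at byte `offset` (hypothetical helper)."""
    with open(path, "rb") as f:
        f.seek(offset)         # jump straight to the position
        return f.read(length)  # read only the bytes that are needed

# Throwaway sample file: the 6 header bytes from the question plus a made-up payload.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\ti\xc0\x00\x00t" + b"PAYLOAD")
    sample_path = tmp.name

chunk = read_at(sample_path, 2, 3)  # 3 bytes starting at offset 2
bits = " ".join(format(byte, "08b") for byte in bytearray(chunk))
print(bits)  # 11000000 00000000 00000000

os.remove(sample_path)  # clean up the sample file
```

Here format(byte, "08b") does the same job as bin(byte)[2:].zfill(8) in a single call.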
I hope this helps!