Python binary file reading problem

Problem description

I'm trying to read a binary file (which represents a matrix in Matlab) in Python. But I am having trouble reading the file and converting the bytes to the correct values.

The binary file consists of a sequence of 4-byte numbers. The first two numbers are the number of rows and columns respectively. My friend gave me a Matlab function he wrote that does this using fwrite. I would like to do something like this:

f = open(filename, 'rb')
rows = f.read(4)
cols = f.read(4)
m = [[0 for c in cols] for r in rows]
r = c = 0
while True:
    if c == cols:
        r += 1
        c = 0
    num = f.read(4)
    if num:
        m[r][c] = num
        c += 1
    else:
        break

But whenever I use f.read(4), I get something like '\x00\x00\x00\x04' (this specific example should represent a 4), and I can't figure out how to convert it into the correct number (using int, hex or anything like that doesn't work). I stumbled upon struct.unpack, but that didn't seem to help very much.

Here is an example matrix and the corresponding binary file that the Matlab function created for it (as it appears when I read the entire file using the Python function f.read() without any size parameter):

4     4     2     4
2     2     2     1
3     3     2     4
2     2     6     2

'\x00\x00\x00\x04\x00\x00\x00\x04@\x80\x00\x00@\x00\x00\x00@@\x00\x00@\x00\x00\x00@\x80\x00\x00@\x00\x00\x00@@\x00\x00@\x00\x00\x00@\x00\x00\x00@\x00\x00\x00@\x00\x00\x00@\xc0\x00\x00@\x80\x00\x00?\x80\x00\x00@\x80\x00\x00@\x00\x00\x00'

So the first 4 bytes and the 5th-8th bytes should both be 4, as the matrix is 4x4, and then the values should be 4, 4, 2, 4, 2, 2, 2, 1, etc.

Thanks, everyone!

Recommended answer

I looked a bit more into your problem; I had never used struct before, so it was a good learning exercise. It turns out there are a couple of twists. First, the matrix values are not stored as 4-byte integers but as 4-byte floats in big-endian form (only the two header values, the row and column counts, are integers). Second, if your example is correct, the matrix was not stored by rows, as one would expect, but by columns. That is, it was output like so (pseudocode):

for j in cols:
  for i in rows:
    write Aij to file
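
To make the column-by-column layout concrete, here is a minimal Python 3 sketch that writes a matrix in that order. The helper name `pack_matrix` is hypothetical, and the layout (two big-endian 4-byte integers, then big-endian 4-byte floats) is inferred from the example bytes in the question, not from the Matlab function, which is not shown.

```python
import struct

def pack_matrix(matrix):
    """Serialize a list of rows: two big-endian ints, then floats column by column.

    Hypothetical helper; the layout is an assumption inferred from the example bytes.
    """
    rows, cols = len(matrix), len(matrix[0])
    data = struct.pack('>ii', rows, cols)      # header: row count, column count
    for j in range(cols):                      # outer loop over columns
        for i in range(rows):                  # inner loop over rows
            data += struct.pack('>f', float(matrix[i][j]))
    return data

example = [[4, 4, 2, 4],
           [2, 2, 2, 1],
           [3, 3, 2, 4],
           [2, 2, 6, 2]]
packed = pack_matrix(example)
print(len(packed))  # 8 header bytes + 16 values * 4 bytes = 72
```

For the 4x4 example matrix, this should produce the same 72 bytes as the file shown in the question, which supports the column-order reading of the data.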

So I had to transpose the result after reading. Here is the code that you need given the example:

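import struct

def readMatrix(f):
    # Header: row and column counts as two big-endian 4-byte integers.
    rows, cols = struct.unpack('>ii', f.read(8))
    # Body: big-endian 4-byte floats stored column by column,
    # so each chunk of `rows` floats is one column.
    m = [list(struct.unpack('>%df' % rows, f.read(4 * rows)))
         for c in range(cols)]
    # Transpose the columns into rows; list() keeps the result
    # a list on Python 3, where zip returns an iterator.
    return list(zip(*m))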

And here we test it:

>>> from io import BytesIO
>>> f = BytesIO(b'\x00\x00\x00\x04\x00\x00\x00\x04@\x80\x00\x00@\x00\x00\x00@@\x00\x00@\x00\x00\x00@\x80\x00\x00@\x00\x00\x00@@\x00\x00@\x00\x00\x00@\x00\x00\x00@\x00\x00\x00@\x00\x00\x00@\xc0\x00\x00@\x80\x00\x00?\x80\x00\x00@\x80\x00\x00@\x00\x00\x00')
>>> mat = readMatrix(f)
>>> for row in mat:
...     print(row)
...
(4.0, 4.0, 2.0, 4.0)
(2.0, 2.0, 2.0, 1.0)
(3.0, 3.0, 2.0, 4.0)
(2.0, 2.0, 6.0, 2.0)
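
The first twist can also be checked on individual 4-byte chunks, which is where int and hex fall short. The sketch below is for Python 3 and uses chunks copied from the example file; the variable names are illustrative only.

```python
import struct

header_chunk = b'\x00\x00\x00\x04'  # first 4 bytes of the example file
value_chunk = b'@\x80\x00\x00'      # bytes 9-12 of the example file

# struct.unpack always returns a tuple, so take element 0.
as_int = struct.unpack('>i', header_chunk)[0]    # big-endian 4-byte integer
as_float = struct.unpack('>f', value_chunk)[0]   # big-endian 4-byte float

# int.from_bytes is an alternative, but only for the integer fields.
as_int_alt = int.from_bytes(header_chunk, byteorder='big')

# Reading a float field as an integer gives a meaningless large number.
wrong = struct.unpack('>i', value_chunk)[0]

print(as_int, as_float, as_int_alt, wrong)  # 4 4.0 4 1082130432
```

The last value shows why the format character matters: the same four bytes decode to 4.0 with 'f' but to 1082130432 with 'i'.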
