Reading a binary file in Python

Question

I have to read a binary file in Python. It was written by a Fortran 90 program as follows:

open(unit=10,file=filename,form='unformatted')
write(10)table%n1,table%n2
write(10)table%nH
write(10)table%T2
write(10)table%cool
write(10)table%heat
write(10)table%cool_com
write(10)table%heat_com
write(10)table%metal
write(10)table%cool_prime
write(10)table%heat_prime
write(10)table%cool_com_prime
write(10)table%heat_com_prime
write(10)table%metal_prime
write(10)table%mu
if (if_species_abundances) write(10)table%n_spec
close(10)

I can easily read this binary file with the following IDL code:

n1=161L
n2=101L
openr,1,file,/f77_unformatted
readu,1,n1,n2
print,n1,n2
spec=dblarr(n1,n2,6)
metal=dblarr(n1,n2)
cool=dblarr(n1,n2)
heat=dblarr(n1,n2)
metal_prime=dblarr(n1,n2)
cool_prime=dblarr(n1,n2)
heat_prime=dblarr(n1,n2)
mu  =dblarr(n1,n2)
n   =dblarr(n1)
T   =dblarr(n2)
Teq =dblarr(n1)
readu,1,n
readu,1,T
readu,1,Teq
readu,1,cool
readu,1,heat
readu,1,metal
readu,1,cool_prime
readu,1,heat_prime
readu,1,metal_prime
readu,1,mu
readu,1,spec
print,spec
close,1

What I want to do is read this binary file with Python, but I have run into some problems. First of all, here is my attempt to read the file:

import numpy
from numpy import *
import struct

file='name_of_my_file'
with open(file,mode='rb') as lines:
    c=lines.read()

I try to read the first two variables:

dummy, n1, n2, dummy = struct.unpack('iiii',c[:16])

But as you can see, I had to add two dummy variables because, somehow, the Fortran program adds the integer 8 in those positions.

The problem now comes when I try to read the remaining bytes: I don't get the same result as the IDL program.

Here is my attempt to read the array n:

double = 8
end = 16+n1*double
nH = struct.unpack('d'*n1,c[16:end])

However, when I print this array I get nonsense values. I can read the file with the IDL code above, so I know what to expect. So my question is: how can I read this file when I don't know its exact structure? Why is it so simple to read with IDL? I need to read this data set with Python.

Recommended answer

What you're looking for is the struct module.

This module lets you unpack values from a byte string (a bytes object in Python 3), interpreting it as packed binary data.

You supply a format string and the bytes read from your file, and it consumes the data and returns the corresponding Python values as a tuple.

For example, using your variables:

import struct
with open('name_of_my_file', 'rb') as f:
    # read() returns the whole file; pass a size to read() if that is too much data
    content = f.read()
# "dddd" means four 8-byte doubles, so unpack needs exactly 32 bytes
n, T, Teq, cool = struct.unpack("dddd", content[:32])

This makes n, T, Teq, and cool hold the first 32 bytes of your file interpreted as four doubles. Of course, this is just a demonstration. Your example looks like it wants lists of doubles; conveniently, struct.unpack returns a tuple, which I expect will still work fine in your case (if not, you can convert it to a list). Keep in mind that struct.unpack must be given exactly as many bytes as the format describes, otherwise you'll get a struct.error. So either slice your input, or read only the number of bytes you'll use, as I noted in the code comment above.
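The exact-size rule can be checked in isolation. The following is a minimal sketch on made-up data (the payload values and variable names are illustrative, not taken from the file in the question); it uses struct.calcsize to compute how many bytes a format needs and struct.unpack_from to read at an offset without slicing.

```python
import struct

# Eight doubles packed into 64 bytes stand in for file contents (made-up values).
payload = struct.pack("8d", *range(8))

fmt = "4d"                     # equivalent to "dddd"
needed = struct.calcsize(fmt)  # number of bytes this format describes

try:
    struct.unpack(fmt, payload)  # 64 bytes supplied, but the format describes 32
    size_mismatch = False
except struct.error:
    size_mismatch = True

# Either slice the buffer to the exact size...
first_four = struct.unpack(fmt, payload[:needed])
# ...or let unpack_from start at a byte offset and take only what it needs.
next_four = struct.unpack_from(fmt, payload, offset=needed)
```

Computing sizes with struct.calcsize avoids hard-coding 8 bytes per double.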

For example, reading only the bytes you need from the file:

n_content = f.read(8*number_of_ns) #8, because doubles are 8 bytes
n = struct.unpack("d"*number_of_ns,n_content)
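As for the stray integer 8 in the question: this is consistent with how Fortran compilers typically write unformatted sequential files, framing each record (one per write statement) with its length in bytes before and after the data. The first record holds two 4-byte integers, hence the value 8 on both sides. If that is the layout here, the next record starts with its own marker, so the array would begin at byte 20 rather than 16. The sketch below assumes 4-byte little-endian markers and double-precision data; that is a common default but should be verified for the actual file. The helper names read_record and write_record are hypothetical, and the data is synthetic.

```python
import struct

def read_record(buf, offset, marker="<i"):
    """Return (payload, next_offset) for one framed record.

    Assumes a length marker before and after the payload; the 4-byte
    little-endian marker format is an assumption, not a given.
    """
    size = struct.calcsize(marker)
    (nbytes,) = struct.unpack_from(marker, buf, offset)
    start = offset + size
    end = start + nbytes
    (trailer,) = struct.unpack_from(marker, buf, end)
    if trailer != nbytes:
        raise ValueError("record markers disagree; wrong marker size or byte order?")
    return buf[start:end], end + size

def write_record(payload, marker="<i"):
    """Frame a payload the same way, only to build test data for this sketch."""
    m = struct.pack(marker, len(payload))
    return m + payload + m

# Synthetic stand-in for the real file: a header record (n1, n2) and one array.
n1, n2 = 3, 2
nH_values = (1.5, 2.5, 3.5)
data = (write_record(struct.pack("<2i", n1, n2))
        + write_record(struct.pack("<3d", *nH_values)))

# Read the records back in the order they were written.
header, pos = read_record(data, 0)
r_n1, r_n2 = struct.unpack("<2i", header)
body, pos = read_record(data, pos)
nH = struct.unpack("<%dd" % r_n1, body)
```

With the real file, data would be the bytes returned by read(), and each write(10) statement in the Fortran source would correspond to one read_record call. Two-dimensional arrays come back flattened in Fortran (column-major) order. If SciPy is available, scipy.io.FortranFile reads this kind of file as well and may be the simpler route.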
