Reading direct access binary file format in Python

Problem description

Background:

A binary file is read on a Linux machine using the following Fortran code:

        parameter(nx=720, ny=360, nday=365)
c 
        dimension tmax(nx,ny,nday),nmax(nx,ny,nday)
        dimension tmin(nx,ny,nday),nmin(nx,ny,nday)
c 
        open(10,
     &file='FILE',
     &access='direct',recl=nx*ny*4)
c
        do k=1,nday
        read(10,rec=(k-1)*4+1)((tmax(i,j,k),i=1,nx),j=1,ny) 
        read(10,rec=(k-1)*4+2)((nmax(i,j,k),i=1,nx),j=1,ny) 
        read(10,rec=(k-1)*4+3)((tmin(i,j,k),i=1,nx),j=1,ny) 
        read(10,rec=(k-1)*4+4)((nmin(i,j,k),i=1,nx),j=1,ny) 
        end do
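
The record numbering in the Fortran loop can be translated into offsets in a flat array of values. The following is only a sketch: it assumes each value is 4 bytes wide and that the file has no record markers (as is usual for direct-access files), and the names `record_slice`, `VALUES_PER_RECORD` and `RECORDS_PER_DAY` are hypothetical helpers, not part of the original code.

```python
NX, NY = 720, 360              # grid size from the Fortran parameter statement
VALUES_PER_RECORD = NX * NY    # one record holds one full nx-by-ny field
RECORDS_PER_DAY = 4            # tmax, nmax, tmin, nmin, in that order


def record_slice(day, var):
    """Return the slice of a flat value array that holds one record.

    day is 0-based (Fortran k - 1); var is 0..3 for tmax, nmax, tmin, nmin.
    """
    rec = day * RECORDS_PER_DAY + var     # 0-based record number
    start = rec * VALUES_PER_RECORD
    return slice(start, start + VALUES_PER_RECORD)
```

With this mapping, `record_slice(k - 1, 2)` corresponds to the Fortran read with `rec=(k-1)*4+3`, the tmin field for day `k`.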

File details:

options  little_endian
title global daily analysis (grid box mean, the grid shown is the center of the grid box)
undef -999.0
xdef 720 linear    0.25 0.50
ydef 360  linear -89.75 0.50
zdef 1 linear 1 1
tdef 365 linear 01jan2015 1dy
vars 4
tmax     1  00 daily maximum temperature (C)
nmax     1  00 number of reports for maximum temperature (C)
tmin     1  00 daily minimum temperature (C)
nmin     1  00 number of reports for minimum temperature (C)
ENDVARS
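
As an illustration of how these entries might be used on the Python side, the sketch below builds the grid-centre coordinates from the `xdef` and `ydef` lines and hides the `undef` flag. It assumes the usual reading of a `linear` axis (start value followed by increment); `mask_undef` is a hypothetical helper name.

```python
import numpy as np

# xdef 720 linear 0.25 0.50  -> first longitude 0.25, step 0.50
lon = 0.25 + 0.50 * np.arange(720)
# ydef 360 linear -89.75 0.50 -> first latitude -89.75, step 0.50
lat = -89.75 + 0.50 * np.arange(360)

UNDEF = -999.0  # missing-value flag from the "undef" line


def mask_undef(values):
    """Return a masked array in which the undef flag is masked out."""
    return np.ma.masked_values(values, UNDEF)
```

Masking the flag before computing statistics keeps -999.0 from being treated as a real temperature.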

Attempted solution:

I am trying to parse this into an array in Python using the following code (purposely leaving out two attributes):

with gzip.open("/FILE.gz", "rb") as infile:
     data = numpy.frombuffer(infile.read(), dtype=numpy.dtype('<f4'), count = -1)

while x <= len(data) / 4:
    tmax.append(data[(x-1)*4])
    tmin.append(data[(x-1)*4 + 2])
    x += 1

data_full = zip(tmax, tmin)

When I test some records, the data does not line up with sample records read from the file with Fortran. I have also tried dtype=numpy.float32, with no success. The file does seem to be read correctly in terms of the number of observations, though. Before I learned that the file was created with Fortran I was also using struct, which did not work either.

There are similar questions on this site, some with answers that I have tried adapting, with no luck.

Update

I am a little bit closer after trying out this code:

#Define numpy variables and empty arrays
nx = 720 #number of lon
ny = 360 #number of lat
nday = 0 #iterate up to 364 (or 365 for leap year)   
tmax = numpy.empty([0], dtype='<f', order='F')
tmin = numpy.empty([0], dtype='<f', order='F')

#Parse the data into numpy arrays, shifting records as the date increments
while nday < 365:
    tmax = numpy.append(tmax, data[(nx*ny)*nday:(nx*ny)*(nday + 1)].reshape((nx,ny), order='F'))
    tmin = numpy.append(tmin, data[(nx*ny)*(nday + 2):(nx*ny)*(nday + 3)].reshape((nx,ny), order='F'))
    nday += 1  

I get the correct data for the first day, but for the second day I get all zeros, for the third day the max is lower than the min, and so on.

Recommended answer

After the update to my question, I realized I had an error in how I was looping. Of course I spotted this about 10 minutes after issuing a bounty, oh well.

The error is in using the day counter to step through the records. That does not work because it advances only once per loop iteration, which does not push the record position far enough: each day occupies four records, not one. That is why some minimums were higher than the maximums. The new code is:

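rm = 0    # record index: 0 = tmax of day 1, 2 = tmin of day 1, 4 = tmax of day 2, ...
nday = 0  # restart the day counter
while nday < 365:
    tmax = numpy.append(tmax, data[(nx*ny)*rm:(nx*ny)*(rm + 1)].reshape((nx,ny), order='F'))
    rm = rm + 2  # skip nmax, land on tmin
    tmin = numpy.append(tmin, data[(nx*ny)*rm:(nx*ny)*(rm + 1)].reshape((nx,ny), order='F'))
    rm = rm + 2  # skip nmin, land on the next day's tmax
    nday += 1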

This uses a record mover (rm, as I call it) to advance through the records by the appropriate amount. That was all it needed.
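
The same record stepping can also be expressed as a single reshape instead of a loop. This is an alternative sketch, not the code from the answer: it assumes the flat array holds exactly `nday * 4` records of `nx * ny` values each, and `split_records` is a hypothetical helper. Because x varies fastest within each record, the result is indexed as `[day, j, i]` (latitude, then longitude) rather than the Fortran `(i, j, k)`.

```python
import numpy as np


def split_records(data, nx, ny, nday, nvars=4):
    """View a flat array of records as (nday, nvars, ny, nx) and pick two fields.

    x varies fastest within each record, so it is the last (C-order) axis.
    """
    grid = np.asarray(data).reshape(nday, nvars, ny, nx)
    tmax = grid[:, 0]   # first record of each day
    tmin = grid[:, 2]   # third record of each day
    return tmax, tmin


# Tiny synthetic example (hypothetical sizes, not the real file)
nx, ny, nday = 3, 2, 2
data = np.arange(nx * ny * 4 * nday, dtype='<f4')
tmax, tmin = split_records(data, nx, ny, nday)
```

For the real file the call would use `nx=720, ny=360, nday=365`. The reshape returns views, so the repeated copying done by `numpy.append` is avoided.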
