如何将fortran 77未格式化的二进制文件读入python [英] How to read fortran 77 unformatted binary file into python

查看:332
本文介绍了如何将fortran 77未格式化的二进制文件读入python的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在IDL中有一个脚本读取未格式化的二进制文件(F77)并将其输出为.sav文件,但我想将此脚本转换为python并保存为.npz文件,并且在读取时遇到问题。

IDL代码:

 ;为QBO创建保存文件模型输出

; --------------------------------------- ------------------------------
;在此处输入数据(仅调整此信息)
; - -------------------------------------------------- ------------------

time = 7300; fortran代码的持续时间
tstep = 5; fortran代码的时间步数
inputfile ='/ home / cwilliams / Metr_205 / qbo / uvwtom。';来自fortran的二进制文件
outputsavefile ='〜cwilliams / Metr_205 / qbo / qbo_FULLX.sav';保存带变量的文件的名称
; --------------------------------------- -----------------------------------------
; ---- -------------------------------------------------- --------------------------





; ================================================= =============================
;不要调整以下代码
; ======================================= =======================================
fname ='dat'
nstep = time / tstep; time / timestep
nm = nstep-1

um = fltarr(17,34)
vm = fltarr(17,34)
wm = fltarr(17,34)
tm = fltarr(17,34)
pm = fltarr(17,34)
om = fltarr(17,34)
vpm = fltarr(17,34)
vdm = fltarr(17,34)


u = fltarr(17,34,nstep)
v = fltarr(17, 34,nstep)
w = fltarr(17,34,nstep)
temp = fltarr(17,34,nstep)
压力= fltarr(17,34,nstep)
臭氧= fltarr(17,34,nstep)

date = findgen(nstep)*(2. * tstep / 365。)
ulevels2 = [ - 30,-25,-20, - 15,-10,-5,5,10,15,20,25,30]
ulevels3 = findgen(15)* 3. -12
lab = findgen(20)* 0 + 1
lat = findgen(17)* 2.7-1.35

ht = findgen(34)* 0.5 + 17。


openr,37,inputfile + fname,/ f77_unformatted& title ='base'
for n = 0,nm do begin
readu,37,um,vm,wm,tm,om,pm
u(*,*,n)= um / 100。
v(*,*,n)= vm
w(*,*,n)= wm
temp(*,*,n)= tm * 700000 / 2.87e6
压力(*,*,N)= PM / 10000。
ozone(*,*,n)= om
endfor
close,37

save,filename = outputsavefile,u,v,w,temp,pressure,臭氧,日期,ulevels2,ulevels3,实验室,$
lat,ht
stop
end

Python代码:
#为QBO模型输出创建保存文件

 #--- -------------------------------------------------- ---------------- 
#在此处输入数据(仅调整此信息)
#--------------- -------------------------------------------------- ----

time = 7300#fortran fortduran代码
tstep = 5#fortran代码的时间步骤
inputfile ='/ home / cwilliams / metr51 / lab12 / uvwtom .dat'#fortran $二进制文件b $ b outputsavefile ='〜cwilliams / metr51 / lab12 / qbo_FULLX.sav'#变量保存文件的名称
#------------ -------------------------------------------------- ------------------
#--------------------------- -------------------------------------------------- ---


导入numpy为n


#== ================================================== ==========================
#不要调整以下代码
#======== ================================================== ====================
nstep = time / tstep#time / timestep
nm = nstep-1

um = n.empty((17,34))
vm = n.empty((17,34))
wm = n.empty((17,34))
tm = n.empty((17,34))
pm = n.empty((17,34))
om = n.empty((17,34))
vpm = n。 ((17,34))
vdm = n.empty((17,34))


u = n.empty((17,34,nstep))
v = n.empty((17,34,nstep))
w = n.empty((17,34,nstep))
temp = n.empty((17,34,nstep) )
pressure = n.empty((17,34,nstep))
ozone = n.empty((17,34,nstep))

date = n.arange (nstep)*(2. * tstep / 365。)
ulevels2 = n.array([ - 30,-25,-20,-15,-10,-5,5,10,15,20, (17)25,30])
ulevels3 = n.arange(15)* 3. -12
lab = n.arange(20)* 0 + 1
lat = n.arange(17) * 2.7-1.35

ht = n.arange(34)* 0.5 + 17。

这是我需要一些帮助的地方:

  f = open(inputfile,'rb')
data = f.read()
为范围内的我(nm + 1):
#readu,37,um,vm,wm,tm,om,pm
#u(:,:,i)= um / 100。
#v(:::i)= vm
#w(:,:,i)= wm
#temp(:,:,i)= tm * 700000. / 2.87 e6
#压力(:,:,i)= pm / 10000。
#ozone(*,*,n)= om

#n.savez(filename = outputsavefile,u = u,v = v,w = w,temp = temp,pressure =压力,臭氧=臭氧,日期=日期,ulevels2 = ulevels2,ulevels3 = ulevels3,lab = lab,lat = lat,ht = ht)

我知道IDL和Python之间的行/列顺序存在问题,但我认为一旦我读取代码就可以解决此问题。

sequential 访问而不是 direct 访问来编写原始Fortran文件这是一个非常重要的区别。我知道IDL,并且您在示例中设置的读取命令与顺序访问一致。



这很重要,因为Fortran顺序访问将记录标记添加到文件中对于每次调用WRITE(),所以你必须在你的Python读例程中说明这一点,就像乔治在注释中链接的答案一样。



至于行/列顺序,真正重要的是哪个维度迭代内存中最快的。例如,Fortran二维数组将在内存中写入第一个数组维度作为最快的迭代。当你将同样的东西读入一个2D Python变量时,它将成为最后一个维度 - 在重塑一个平面数组时,记住这一点,如果要保持索引,就必须进行转置。



我认为下面的代码可以做你想要的东西,为简洁起见,我只包含了你的um和vm变量:

  import numpy as np 

um = np.empty((34,17),dtype ='float32')#使这些尺寸倒退以便于重塑
vm = np.empty((34,17),dtype ='float32')#也注意,我相信默认类型是float64

f = open(inputfile,'rb')
recl = np.zeros(1,dtype = np.uint32)
代表范围内的i(nm + 1):
recl = np.fromfile(f,dtype ='uint32',count = 1)
tmpu = np.fromfile(f,dtype ='float32',count = um.size)#这些数组是平的
tmpv = np.fromfile(f,dtype ='float32', count = vm.size)
recl = np.fromfile(f,dtype ='uint32',count = 1)

um = np.transpose(np.reshape(tmpu,um.shape))
vm = np.transpose(np.reshape(tmpv,vm.shape))

然后你也可以在你的其他变量的循环中做任何你想做的事情,并且你应该能够象你在IDL中习惯的那样对um和vm进行索引或Fortran。这可能不是最简单或最好的方式,但我认为它至少会让你开始。


I have a script in IDL which reads an unformatted binary file (F77) and outputs it as a .sav file but I want to convert this script to python and save as an .npz file and I am having trouble on the read line.

IDL Code:

;create save file for QBO model output

;---------------------------------------------------------------------
;input data here (Only adjust this info)
;---------------------------------------------------------------------

  time=7300  ;duration from fortran code
  tstep=5   ;time step from fortran code
  inputfile='/home/cwilliams/Metr_205/qbo/uvwtom.'; binary file from fortran
  outputsavefile='~cwilliams/Metr_205/qbo/qbo_FULLX.sav'; name of save file with     variables
;--------------------------------------------------------------------------------
;--------------------------------------------------------------------------------





;==============================================================================
;   Do Not Adjust the Following Code
;==============================================================================
fname='dat'
nstep=time/tstep ;time/timestep
nm=nstep-1

um=fltarr(17,34)
vm=fltarr(17,34)
wm=fltarr(17,34)
tm=fltarr(17,34)
pm=fltarr(17,34)
om=fltarr(17,34)
vpm=fltarr(17,34)
vdm=fltarr(17,34)


u=fltarr(17,34,nstep)
v=fltarr(17,34,nstep)
w=fltarr(17,34,nstep)
temp=fltarr(17,34,nstep)
pressure=fltarr(17,34,nstep)
ozone=fltarr(17,34,nstep)

date=findgen(nstep)*(2.*tstep/365.)
ulevels2=[-30, -25, -20, -15, -10, -5, 5, 10, 15,  20, 25, 30]
ulevels3=findgen(15)*3.-12
lab=findgen(20)*0+1
lat=findgen(17)*2.7-1.35

ht=findgen(34)*0.5+17.


openr,37,inputfile+fname,/f77_unformatted & title='base'
for n=0,nm do begin
    readu,37,um,vm,wm,tm,om,pm
    u(*,*,n)=um/100.
    v(*,*,n)=vm
    w(*,*,n)=wm
    temp(*,*,n)=tm*700000./2.87e6
    pressure(*,*,n)=pm/10000.
    ozone(*,*,n)=om
endfor
close,37

save,filename=outputsavefile,u,v,w,temp,pressure,ozone,date,ulevels2,ulevels3,lab,$
lat,ht
stop
end

Python Code: #create save file for QBO model output

#---------------------------------------------------------------------
#input data here (Only adjust this info)
#---------------------------------------------------------------------

time=7300  #duration from fortran code
tstep=5   #time step from fortran code
inputfile='/home/cwilliams/metr51/lab12/uvwtom.dat'# binary file from fortran
outputsavefile='~cwilliams/metr51/lab12/qbo_FULLX.sav'# name of save file with variables
#--------------------------------------------------------------------------------
#--------------------------------------------------------------------------------


import numpy as n


#==============================================================================
#   Do Not Adjust the Following Code
#==============================================================================
nstep=time/tstep #time/timestep
nm=nstep-1

um=n.empty((17,34))
vm=n.empty((17,34))
wm=n.empty((17,34))
tm=n.empty((17,34))
pm=n.empty((17,34))
om=n.empty((17,34))
vpm=n.empty((17,34))
vdm=n.empty((17,34))


u=n.empty((17,34,nstep))
v=n.empty((17,34,nstep))
w=n.empty((17,34,nstep))
temp=n.empty((17,34,nstep))
pressure=n.empty((17,34,nstep))
ozone=n.empty((17,34,nstep))

date=n.arange(nstep)*(2.*tstep/365.)
ulevels2=n.array([-30, -25, -20, -15, -10, -5, 5, 10, 15,  20, 25, 30])
ulevels3=n.arange(15)*3.-12
lab=n.arange(20)*0+1
lat=n.arange(17)*2.7-1.35

ht=n.arange(34)*0.5+17.

This is where I need some assitance:

f=open(inputfile,'rb')     
data=f.read()
for i in range(nm+1):
#    readu,37,um,vm,wm,tm,om,pm
#    u(:,:,i)=um/100.
#    v(:,:,i)=vm
#    w(:,:,i)=wm
#    temp(:,:,i)=tm*700000./2.87e6
#    pressure(:,:,i)=pm/10000.
#    ozone(*,*,n)=om
#
#n.savez(filename=outputsavefile,u=u,v=v,w=w,temp=temp,pressure=pressure,ozone=ozone,date=date,ulevels2=ulevels2,ulevels3=ulevels3,lab=lab,lat=lat,ht=ht)

I know there is an issue with the row/column order between IDL and Python but I think I can fix this once I read the code in.

解决方案

First off, I have to assume that you wrote the original Fortran file using sequential access rather than direct access, this is a very important distinction. I do know IDL and the read command you have set up in your example is consistent with sequential access.

This matters because Fortran sequential access adds 'record markers' to the file for every call to WRITE(), so you'll have to account for that in your Python read routine, much as in the answer that george linked in the comments.

As to row/column order, it really just matters which dimension is iterating the fastest in memory. A Fortran 2D array, for example, will write in memory with the first array dimension as the fastest iterating. When you read the same thing into a 2D Python variable, it will be the last dimension - keep that in mind when reshaping a flat array, and you'll have to transpose it if you want to keep your indexing.

I think the following code will do approximately what you want, for brevity I just included your um and vm variables:

import numpy as np

um=np.empty((34,17), dtype='float32') # Make these dimensions "backwards" for easier reshaping
vm=np.empty((34,17), dtype='float32') # Also watch out, I believe the default type is float64

f = open(inputfile,'rb')
recl = np.zeros(1,dtype=np.uint32)
for i in range(nm+1):
    recl = np.fromfile(f, dtype='uint32', count=1)
    tmpu = np.fromfile(f, dtype='float32', count=um.size) # These arrays will be flat
    tmpv = np.fromfile(f, dtype='float32', count=vm.size)
    recl = np.fromfile(f, dtype='uint32', count=1)

    um = np.transpose(np.reshape(tmpu, um.shape))
    vm = np.transpose(np.reshape(tmpv, vm.shape))

Then you may also do whatever you wish in the loop with your other variables, and you should be able to index um and vm like you are used to in IDL or Fortran. This might not be the easiest or best way to do it, but I think it will at least get you started.

这篇关于如何将fortran 77未格式化的二进制文件读入python的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆