Reading records from a Fortran binary file in Python
Question
I need to read a compressed Fortran binary file with Python, named merg_2015041312_4km-pixel.Z (from here); the structure of the uncompressed file is defined here. The definition says that
Each file contains 2 records: the 1st for the "on the hour" images (":00") and the 2nd for the "on the half hour" images (":30").
and
Each record is a 9896 x 3298 Fortran array of IR brightness temperatures that have been scaled to fit into 1-byte by subtracting "75" from each datum.
GrADS .ctl file description:
DSET merg_1999042012_4km-pixel
OPTIONS yrev little_endian template
UNDEF 330
TITLE globally merged IR data
XDEF 9896 LINEAR 0.0182 0.036378335
YDEF 3298 LINEAR -59.982 0.036383683
ZDEF 01 LEVELS 1
TDEF 99999 LINEAR 12z04Apr1999 30mn
VARS 1
ch4 1 -1,40,1,-1 IR BT (add '75' to this value)
ENDVARS
I tried to write some Python code:
>>> import struct
>>> file = open("merg_2015041312_4km-pixel", 'rb')
>>> data = struct.unpack('>h', file.read())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
struct.error: unpack requires a string argument of length 2
Unfortunately, I'm not used to binary files...
How can I obtain the second record (half hourly) from this file?
Recommended answer
From reading the dataset description and testing with a dataset, it is clear that the file is a (compressed) Fortran direct-access file with 2 records of IR data of size (9896, 3298), scaled to fit in 1 byte by subtracting 75 from the values. It is not clear whether the resulting byte is unsigned or signed, because I have no experience with GrADS control definitions.
numpy.fromfile is the tool for easily reading binary Fortran direct-access files. Use either int8 or uint8 as dtype to build your record data type object, and check which one makes sense. For example, if the data is IR temperatures in kelvins, using uint8 would result in min 186 and max 330 after scaling (~ -87°C to 56.8°C). Upcast to a big enough type before adding the offset back; float is used here, but int16, int32, etc. would also work.
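import numpy as np

H = 9896  # first (fastest-varying) Fortran dimension, XDEF in the .ctl file
W = 3298  # second Fortran dimension, YDEF in the .ctl file
# One record is H*W single-byte values; use 'int8' here if the bytes are signed
Record = np.dtype(('uint8', H*W))
A = np.fromfile('merg_2015041312_4km-pixel',
                dtype=Record, count=2).astype('float') + 75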
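To make the int8 versus uint8 choice concrete, here is a minimal sketch that decodes the same three made-up bytes both ways. The byte values are hypothetical and chosen only for illustration. Note that 255 + 75 = 330, which matches the UNDEF 330 line in the .ctl file; that would be consistent with unsigned bytes, but it is an inference rather than something the dataset description states.

```python
import numpy as np

# Three hypothetical stored bytes: smallest, an arbitrary middle value, largest.
raw = bytes([0, 111, 255])

# Interpret the same bytes as unsigned and as signed, then add the offset back.
as_unsigned = np.frombuffer(raw, dtype=np.uint8).astype(float) + 75
as_signed = np.frombuffer(raw, dtype=np.int8).astype(float) + 75

print(as_unsigned.tolist())  # [75.0, 186.0, 330.0]
print(as_signed.tolist())    # [75.0, 186.0, 74.0]
```

Only bytes above 127 differ between the two interpretations, so checking the minimum and maximum of the decoded data is usually enough to tell which one gives physically plausible temperatures.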
Each row of A is a 1-d array that has to be reshaped to the correct dimensions. Data type objects do support sub-arrays, but these are always read in C-contiguous memory layout, so the Fortran ordering has to be applied by hand.
I_on_the_hour = A[0].reshape((H, W), order='F') # Fortran data order
I_on_the_half_hour = A[1].reshape((H, W), order='F')
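If only the half-hour record is needed, one alternative is to seek past the first record and read a single block. The following is a sketch, not part of the original answer: the helper name read_record is hypothetical, and it assumes unsigned bytes and a file with no record markers (plain direct-access layout, one byte per pixel).

```python
import numpy as np

def read_record(path, index, nx=9896, ny=3298):
    """Read one record (0-based index) and return it as an (nx, ny) float array."""
    record_bytes = nx * ny  # one byte per pixel, no record markers assumed
    with open(path, "rb") as f:
        f.seek(index * record_bytes)  # skip the preceding records
        raw = np.fromfile(f, dtype=np.uint8, count=record_bytes)
    if raw.size != record_bytes:
        raise ValueError("file is too short for the requested record")
    # Fortran (column-major) order, then undo the "subtract 75" scaling.
    return raw.reshape((nx, ny), order="F").astype(float) + 75
```

With the uncompressed file in the working directory, read_record("merg_2015041312_4km-pixel", 1) would return the ":30" image without loading the ":00" record.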
Check that the results look sane (in ipython --pylab):
plt.imshow(I_on_the_half_hour)
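Before plotting, it may help to mask undefined pixels and put the shorter (3298) axis along the rows. The sketch below assumes that 330 marks missing data, based on the UNDEF 330 line in the .ctl file and on the uint8 interpretation. The helper name prepare_for_display is hypothetical, and the north/south orientation implied by the yrev option is not handled here.

```python
import numpy as np

def prepare_for_display(record, undef=330.0):
    """Return a transposed float copy with undefined pixels set to NaN."""
    # Copy and transpose: an (nx, ny) record becomes an (ny, nx) image.
    image = np.array(record, dtype=float).T
    # Assumption: values equal to `undef` are missing data.
    image[image == undef] = np.nan
    return image
```

The result can be passed to plt.imshow in place of the raw array; NaN pixels are simply left blank by the default colormaps.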