Read a large big-endian binary file


Question

I have a very large big-endian binary file, and I know how many numbers it contains. I found a way to read a big-endian file using struct, and it works perfectly if the file is small:

    import struct

    data = []
    with open('some_file.dat', 'rb') as file:
        for i in range(numcount):  # numcount: number of floats in the file
            data.append(struct.unpack('>f', file.read(4))[0])

But this code is very slow if the file is larger than about 100 MB. My current file is 1.5 GB and contains 399,513,600 floats. The code above takes about 8 minutes on this file.

I found another solution that works faster:

    import struct

    with open('some_file.dat', 'rb') as f:
        datafile = f.read()  # whole file in memory as bytes
    f_len = ">" + "f" * numcount   # numcount = 399513600

    numbers = struct.unpack(f_len, datafile)

This code runs in about 1.5 minutes, but that is still too slow for me. Earlier I wrote functionally equivalent code in Fortran, and it ran in about 10 seconds.

In Fortran I open the file with the "big-endian" flag and can simply read it into a REAL array without any conversion, but in Python I have to read the file as a string and convert every 4 bytes into a float using struct. Is it possible to make the program run faster?

Recommended answer

You can use numpy.fromfile to read the file, and specify big-endian byte order with > in the dtype parameter:

    numpy.fromfile(filename, dtype='>f')
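To make the one-liner above concrete, here is a minimal sketch that writes a tiny big-endian test file and reads it back. The helper name `read_big_endian_floats` and the temporary demo file are assumptions for illustration, not part of the original answer; `'>f4'` is just the explicit spelling of `'>f'` (big-endian, 4-byte float), and the final conversion to native byte order is optional.

```python
import os
import struct
import tempfile

import numpy as np


def read_big_endian_floats(path, count=-1):
    """Hypothetical helper: read `count` big-endian float32 values (-1 = all)."""
    # '>f4' means big-endian, 4-byte float; equivalent to '>f'.
    values = np.fromfile(path, dtype='>f4', count=count)
    # Optional: convert to the machine's native byte order. This copies the
    # data, so it can be skipped when the big-endian array is good enough.
    return values.astype(np.float32)


# Build a tiny demonstration file containing three big-endian floats.
sample = (1.0, -2.5, 3.25)
with tempfile.NamedTemporaryFile(suffix='.dat', delete=False) as tmp:
    tmp.write(struct.pack('>3f', *sample))
    demo_path = tmp.name

demo_values = read_big_endian_floats(demo_path)
os.remove(demo_path)
print(demo_values)
```

Because the whole file is read into one typed array in a single call, no per-value Python objects are created, which is presumably where most of the time goes in the struct-based versions from the question.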

There is an array.fromfile method too, but unfortunately I cannot see any way to control endianness with it, so depending on your use case it might either save you the dependency on a third-party library or be useless.
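One possible workaround, which is not part of the original answer: array objects also have a `byteswap()` method, so the data can be read in the machine's native order and then swapped in place when the host is little-endian. The sketch below assumes that the `'f'` typecode is a 4-byte C float (true on common platforms) and uses a hypothetical helper name and a throwaway demo file.

```python
import os
import struct
import sys
import tempfile
from array import array


def read_big_endian_floats_stdlib(path, count):
    """Hypothetical helper: read `count` big-endian floats with the stdlib only."""
    values = array('f')  # assumption: 'f' is a 4-byte C float on this platform
    with open(path, 'rb') as fh:
        # fromfile reads raw bytes in native order; it raises EOFError if
        # fewer than `count` items are available.
        values.fromfile(fh, count)
    if sys.byteorder == 'little':
        # The file is big-endian, so swap every item in place.
        values.byteswap()
    return values


# Build a tiny demonstration file containing three big-endian floats.
sample = (1.0, -2.5, 3.25)
with tempfile.NamedTemporaryFile(suffix='.dat', delete=False) as tmp:
    tmp.write(struct.pack('>3f', *sample))
    demo_path = tmp.name

stdlib_values = read_big_endian_floats_stdlib(demo_path, len(sample))
os.remove(demo_path)
print(list(stdlib_values))
```

This avoids the third-party dependency, at the cost of an explicit byte-order check and an extra pass over the data on little-endian machines.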
