Reading an entire binary file into Python
Question
I need to import a binary file from Python -- the contents are signed 16-bit integers, big endian.
The following Stack Overflow questions suggest how to pull in several bytes at a time, but is this the way to scale up to read in a whole file?
I thought to create a function like:
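    import os
    import struct

    import numpy as np

    def readmyfile(filename, bytes=2, endian='>h'):
        # Number of values = file size / bytes per value (integer division).
        totalBytes = os.path.getsize(filename)
        values = np.empty(totalBytes // bytes)
        with open(filename, 'rb') as f:
            # One read() and one unpack() per value; this loop is the slow part.
            for i in range(len(values)):
                values[i] = struct.unpack(endian, f.read(bytes))[0]
        return values

    filecontents = readmyfile('filename')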
But this is quite slow (the file is 165924350 bytes). Is there a better way?
Recommended answer
I would read directly until EOF (that is, until read() returns an empty string, or an empty bytes object in Python 3), which removes the need for range() and getsize.
Alternatively, using xrange (instead of range) should improve things, especially memory usage. This applies to Python 2; in Python 3, range is already lazy.
Moreover, as Falmarri suggested, reading more data at a time would improve performance quite a lot.
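The two suggestions above (read until EOF, and read more data per call) could be combined roughly as follows. This is only a sketch under stated assumptions: the function name `read_int16_be_chunked` and the `chunk_values` parameter are hypothetical, chosen for illustration, and the values are still collected in a plain list.

```python
import struct

def read_int16_be_chunked(filename, chunk_values=65536):
    """Read big-endian signed 16-bit integers, many values per read() call.

    Hypothetical helper for illustration; chunk_values is an assumed
    tuning knob (number of 16-bit values requested per read).
    """
    item_size = struct.calcsize('>h')  # 2 bytes per value
    values = []
    with open(filename, 'rb') as f:
        while True:
            chunk = f.read(chunk_values * item_size)
            if not chunk:  # an empty bytes object signals EOF
                break
            # Drop a trailing partial value if the file length is odd.
            count = len(chunk) // item_size
            values.extend(
                struct.unpack('>%dh' % count, chunk[:count * item_size])
            )
    return values
```

Unpacking a whole chunk with a single format string such as '>65536h' replaces thousands of per-value struct.unpack calls with one, which is where most of the saving would be expected to come from.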
That said, I would not expect miracles, partly because I am not sure a list is the most efficient way to store that much data.
What about using NumPy's arrays and its facilities for reading and writing binary files? In this link there is a section about reading raw binary files, using numpyio.fread. I believe this should be exactly what you need.
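The answer points to numpyio.fread, which is not shown here. As a swapped-in alternative (not the call the answer names), NumPy's own `numpy.fromfile` can load the whole file in one step when given an explicit big-endian dtype. A minimal sketch, assuming the file contains nothing but raw 16-bit values with no header; the function name `read_int16_be` is hypothetical:

```python
import numpy as np

def read_int16_be(filename):
    """Load a whole file of big-endian signed 16-bit integers in one call.

    Hypothetical helper for illustration; assumes the file is headerless
    raw data.
    """
    # '>i2' means: big-endian ('>'), signed integer ('i'), 2 bytes wide.
    return np.fromfile(filename, dtype='>i2')
```

The returned array keeps the big-endian dtype; calling `.astype(np.int16)` on it would give a copy in the machine's native byte order if later code needs that.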
Note: personally, I have never used NumPy; however, its main raison d'être is precisely the handling of large data sets, which is what you are doing in your question.