How to loop over a binary file in Python in chunks


Problem description

I'm trying to use Python to loop over a long binary file filled with 8-byte records.

Each record has the format [ uint16 | uint16 | uint32 ]
(which is "HHI" in struct format notation)

Apparently, each 8-byte block is being treated as an int instead of an array of 8 bytes, which causes the struct.unpack call to fail:

with open(fname, "rb") as f:
    sz=struct.calcsize("HHI")
    print(sz)                # This shows 8, as expected 
    for raw in f.read(sz):   # Expect this should read 8 bytes into raw
        print(type(raw))     # This says raw is an 'int', not a byte-array
        record=struct.unpack("HHI", raw ) # "TypeError: a bytes-like object is required, not 'int'"
        print(record)

How can I read my file as a series of structures and print each one out?

Recommended answer

The iter builtin, when passed a callable and a sentinel value, calls the callable repeatedly until it returns the sentinel value.

So you can create a partial function with functools.partial (or use a lambda) and pass it to iter, like this:

import functools

with open('foo.bin', 'rb') as f:
    chunker = functools.partial(f.read, 8)
    for chunk in iter(chunker, b''):      # Read 8-byte chunks until read() returns b'' at end of file
        print(chunk)                      # Do stuff with chunk
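
For context, the loop in the question iterated over the bytes object returned by a single f.read(sz) call, and iterating over bytes in Python 3 yields one int per byte, which is why struct.unpack raised a TypeError. A minimal sketch of how the iter approach might be combined with the "HHI" format from the question follows. The helper name read_records, the truncated-record check, and the in-memory io.BytesIO stand-in for the file are assumptions made for illustration, not part of the original answer.

```python
import functools
import io
import struct

# Same format as in the question: uint16, uint16, uint32 (8 bytes in total).
RECORD = struct.Struct("HHI")


def read_records(f):
    """Yield one (uint16, uint16, uint32) tuple per record (hypothetical helper)."""
    read_one = functools.partial(f.read, RECORD.size)
    # iter() keeps calling read_one() until it returns the sentinel b'' at end of file.
    for chunk in iter(read_one, b""):
        if len(chunk) != RECORD.size:
            # Assumption: a short final chunk means the file is truncated.
            raise ValueError(f"truncated record: got {len(chunk)} bytes")
        yield RECORD.unpack(chunk)


# Usage with an in-memory stand-in for a real file opened in "rb" mode.
sample = io.BytesIO(RECORD.pack(1, 2, 3) + RECORD.pack(4, 5, 6))
records = list(read_records(sample))
print(records)
```

With a real file, the same generator could be used as `for record in read_records(f): print(record)` inside a `with open(fname, "rb") as f:` block. If the file was written with a specific byte order, a prefix such as "<" or ">" in the format string would likely be needed.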
