Can someone explain Python struct unpacking?


Problem description

I have a binary file made from C structs that I want to parse in Python. I know the exact format and layout of the binary file, but I am confused about how to use Python's struct unpacking to read this data.

Would I have to traverse the whole binary, unpacking a certain number of bytes at a time based on what the members of the struct are?

The C file format:

typedef struct {
  int data1;
  int data2;
  int data4;
} datanums;

typedef struct {
  datanums numbers;
  char *name;
} personal_data;

Let's say the binary file has personal_data structs repeated one after another.

Recommended answer

Assuming the layout is a static binary structure that can be described by a simple struct pattern, and the file is just that structure repeated over and over again, then yes, "traverse the whole binary unpacking a certain number of bytes at a time" is exactly what you'd do.

For example:

import struct

# Example record layout: big-endian unsigned short, unsigned char,
# 10 single bytes, unsigned long (17 bytes per record).
record = struct.Struct('>HB10cL')

def read_records():
    with open('myfile.bin', 'rb') as f:
        while True:
            buf = f.read(record.size)
            if not buf:
                break
            yield record.unpack(buf)
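
Wrapped up as a generator like that, you just iterate over it to get one tuple of unpacked values per record (read_records is the hypothetical name used in the sketch above):

for fields in read_records():
    # fields is a tuple of the unpacked values for one 17-byte record
    print(fields)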

If you're worried about the efficiency of only reading 17 bytes at a time and you want to wrap that up by buffering 8K at a time or something… well, first make sure it's an actual problem worth optimizing; then, if it is, loop over unpack_from instead of unpack. Something like this (untested, top-of-my-head code):

def read_records_buffered():
    buf, offset = b'', 0
    with open('myfile.bin', 'rb') as f:
        while True:
            if len(buf) - offset < record.size:
                # Not enough unread bytes left; slide the buffer and refill it.
                buf, offset = buf[offset:] + f.read(8192), 0
                if len(buf) < record.size:
                    break  # end of file
            yield record.unpack_from(buf, offset)
            offset += record.size

Or, even simpler, as long as the file isn't too big for your vmsize, just mmap the whole thing and use unpack_from on the mmap itself:

import mmap

def read_records_mmap():
    with open('myfile.bin', 'rb') as f:
        # Map the whole file read-only and unpack each record in place.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            for offset in range(0, m.size(), record.size):
                yield record.unpack_from(m, offset)
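
Applied to the structs in the question, the fixed-size datanums part can be described the same way. The following is only a sketch under some assumptions: the ints are taken to be 32-bit, little-endian, with standard (unpadded) sizes, and since char *name is a pointer, a raw dump of personal_data would only contain an address, so the sketch assumes the name was instead written out as a fixed-length 32-byte field:

import struct

# Hypothetical pattern for the datanums struct, assuming little-endian
# 32-bit ints with standard sizes: data1, data2, data4.
datanums = struct.Struct('<iii')

# If the name were stored as a fixed 32-byte field rather than a char*
# pointer, one pattern could describe the whole record:
personal_record = struct.Struct('<iii32s')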
