Unpacking a struct ending with an ASCIIZ string

Problem description

I am trying to use struct.unpack() to take apart a data record that ends with an ASCII string.

The record (it happens to be a TomTom ov2 record) has this format (stored little-endian):

  • 1 byte
  • 4-byte int for total record size (including this field)
  • 4-byte int
  • 4-byte int
  • variable-length string, null-terminated

unpack() requires that the string's length be included in the format you pass it. I can use the second field and the known size of the rest of the record (13 bytes) to get the string length:

str_len = struct.unpack("<xi", record[:5])[0] - 13
fmt = "<biii{0}s".format(str_len)

then proceed with the full unpacking, but since the string is null-terminated, I really wish unpack() would do it for me. It'd also be nice to have this should I run across a struct that doesn't include its own size.
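
The two-step approach described above might look roughly like this. It is only a sketch: the helper name unpack_ov2_record is hypothetical, and the sample record is built with struct.pack() purely for illustration, so its size field is consistent by construction.

```python
import struct

HEADER_FMT = "<biii"                      # 1 byte, total size, two 4-byte ints
HEADER_LEN = struct.calcsize(HEADER_FMT)  # 13 bytes of fixed-width fields


def unpack_ov2_record(record):
    """Hypothetical helper: unpack one record using its own size field."""
    # Skip the first byte ('x' is a pad byte) and read the 4-byte total size.
    total_size = struct.unpack("<xi", record[:5])[0]
    str_len = total_size - HEADER_LEN     # string length, NUL included
    fmt = "{0}{1}s".format(HEADER_FMT, str_len)
    fields = struct.unpack(fmt, record[:total_size])
    # The string still carries its terminator, so cut it at the first NUL.
    return fields[:-1] + (fields[-1].split(b"\x00", 1)[0],)


# Illustrative record only; the field values are made up.
name = b"Down by the river\x00"
sample = struct.pack(HEADER_FMT, 2, HEADER_LEN + len(name), 4886138, 229297) + name
print(unpack_ov2_record(sample))
```

The NUL has to be removed by hand here, which is the part I would like unpack() to take care of.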

How can I make that happen?

Solution

The size-less record is fairly easy to handle, actually, since struct.calcsize() will tell you the length it expects. You can use that and the actual length of the data to construct a new format string for unpack() that includes the correct string length.
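
As a rough illustration of that idea, assume a hypothetical size-less record laid out as "<bii" followed by the string (this layout is made up for the example, not the ov2 format). The string length falls out of the difference between the data length and what calcsize() reports for the fixed fields:

```python
import struct

# Hypothetical size-less record: 1 byte, two 4-byte ints, then an ASCIIZ string.
dat = struct.pack("<bii", 2, 4886138, 229297) + b"Down by the river\x00"

fixed_len = struct.calcsize("<bii")   # bytes taken by the fixed-width fields
str_len = len(dat) - fixed_len        # everything left over: string plus NUL
fmt = "<bii{0}s".format(str_len)      # format with the string length filled in

print(fmt)
print(struct.unpack(fmt, dat))
```

This relies on dat holding exactly one record. With the "<" prefix the fields are packed without padding, so the calcsize() result is simply the sum of the field sizes.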

This function is just a wrapper for unpack() that allows a new format character, 'z', in the last position of the format string; 'z' drops the terminating NUL from the returned string:

import struct


def unpack_with_final_asciiz(fmt, dat):
    """
    Unpack binary data, handling a null-terminated string at the end
    (and only at the end) automatically.

    The first argument, fmt, is a struct.unpack() format string with the
    following modifications:
    If fmt's last character is 'z', the returned string will drop the NUL.
    If it is 's' with no length, the string including NUL will be returned.
    If it is 's' with a length, behavior is identical to normal unpack().
    """
    # Just pass on if no special behavior is required
    # (fmt[-2:-1] rather than fmt[-2] so a one-character fmt cannot raise IndexError)
    if fmt[-1] not in ('z', 's') or (fmt[-1] == 's' and fmt[-2:-1].isdigit()):
        return struct.unpack(fmt, dat)

    # Use format string to get size of contained string and rest of record
    non_str_len = struct.calcsize(fmt[:-1])
    str_len = len(dat) - non_str_len

    # Set up new format string
    # If passed 'z', treat terminating NUL as a "pad byte"
    if fmt[-1] == 'z':
        str_fmt = "{0}sx".format(str_len - 1)
    else:
        str_fmt = "{0}s".format(str_len)
    new_fmt = fmt[:-1] + str_fmt

    return struct.unpack(new_fmt, dat)


>>> dat = b'\x02\x1e\x00\x00\x00z\x8eJ\x00\xb1\x7f\x03\x00Down by the river\x00'
>>> unpack_with_final_asciiz("<biiiz", dat)
(2, 30, 4886138, 229297, b'Down by the river')
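
The wrapper above assumes dat holds exactly one record. When several records sit back to back in one buffer, a different technique may be more convenient: read the fixed fields with struct.unpack_from() and then search for the terminator. The sketch below is an alternative, not part of the wrapper; the helper name unpack_asciiz_from and the "<bii" layout are hypothetical.

```python
import struct


def unpack_asciiz_from(fmt, buf, offset=0):
    """Hypothetical helper: unpack fmt at offset, then the ASCIIZ string after it.

    Returns (fields, next_offset), where fields ends with the string
    (NUL removed) and next_offset points just past the terminator.
    """
    fixed = struct.unpack_from(fmt, buf, offset)
    start = offset + struct.calcsize(fmt)
    end = buf.index(b"\x00", start)   # raises ValueError if no NUL is found
    return fixed + (buf[start:end],), end + 1


# Two made-up records stored back to back.
buf = (struct.pack("<bii", 1, 10, 20) + b"first\x00" +
       struct.pack("<bii", 2, 30, 40) + b"second\x00")

rec1, pos = unpack_asciiz_from("<bii", buf)
rec2, pos = unpack_asciiz_from("<bii", buf, pos)
print(rec1)
print(rec2)
```

Because the string length comes from the position of the NUL, this variant needs neither a size field nor a buffer trimmed to a single record.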
