Parsing binary data into ctypes Structure object via readinto()


Problem description

I'm trying to handle a binary format, following the example here:

http://dabeaz.blogspot.jp/2009/08/python-binary-io-handling.html

>>> from ctypes import *
>>> class Point(Structure):
...     _fields_ = [('x', c_double), ('y', c_double), ('z', c_double)]
...
>>> g = open("foo", "rb")  # point structure data
>>> q = Point()
>>> g.readinto(q)
24
>>> q.x
2.0

I've defined a Structure of my header and I'm trying to read data into my structure, but I'm having some difficulty. My structure is like this:

class BinaryHeader(BigEndianStructure):
    _fields_ = [
                ("sequence_number_4bytes", c_uint),
                ("ascii_text_32bytes", c_char),
                ("timestamp_4bytes", c_uint),
                ("more_funky_numbers_7bytes", c_uint, 56),
                ("some_flags_1byte", c_byte),
                ("other_flags_1byte", c_byte),
                ("payload_length_2bytes", c_ushort),

                ] 

The ctypes documentation says:

For integer type fields like c_int, a third optional item can be given. It must be a small positive integer defining the bit width of the field.

So for ("more_funky_numbers_7bytes", c_uint, 56), I've tried to define the field as a 7 byte field, but I'm getting the error:

ValueError: number of bits invalid for bit field

So my first problem is: how can I define a 7-byte int field?

Then, if I skip that problem and comment out the "more_funky_numbers_7bytes" field, the resulting data gets loaded in, but as expected only 1 character is loaded into "ascii_text_32bytes". And for some reason readinto() returns 16, which I assume is the calculated number of bytes it read into the structure. But if I'm commenting out my "funky number" field and "ascii_text_32bytes" is only giving one char (1 byte), shouldn't that be 13, not 16?

Then I tried breaking out the char field into a separate structure, and reference that from within my Header structure. But that's not working either...

class StupidStaticCharField(BigEndianStructure):
    _fields_ = [
                ("ascii_text_1", c_byte),
                ("ascii_text_2", c_byte),
                ("ascii_text_3", c_byte),
                ("ascii_text_4", c_byte),
                ("ascii_text_5", c_byte),
                ("ascii_text_6", c_byte),
                ("ascii_text_7", c_byte),
                ("ascii_text_8", c_byte),
                ("ascii_text_9", c_byte),
                ("ascii_text_10", c_byte),
                ("ascii_text_11", c_byte),
                .
                .
                .
                ]

class BinaryHeader(BigEndianStructure):
    _fields_ = [
                ("sequence_number_4bytes", c_uint),
                ("ascii_text_32bytes", StupidStaticCharField),
                ("timestamp_4bytes", c_uint),
                #("more_funky_numbers_7bytes", c_uint, 56),
                ("some_flags_1byte", c_ushort),
                ("other_flags_1byte", c_ushort),
                ("payload_length_2bytes", c_ushort),

                ] 

So, any ideas how to:

  1. Define a 7 byte field (which I'll need to decode using a custom function)

  2. Define a static char field of 32 bytes

UPDATE

I've found a structure that seems to work...

class BinaryHeader(BigEndianStructure):
    _fields_ = [
                ("sequence_number_4bytes", c_uint),
                ("ascii_text_32bytes", c_char * 32),
                ("timestamp_4bytes", c_uint),
                ("more_funky_numbers_7bytes", c_byte * 7),
                ("some_flags_1byte", c_byte),
                ("other_flags_1byte", c_byte),
                ("payload_length_2bytes", c_ushort),

                ]  

However, my remaining question now is why, when using .readinto():

f = open(binaryfile, "rb")

mystruct = BinaryHeader()
f.readinto(mystruct)

It's returning 52 and not the expected 51. Where is that extra byte coming from, and where does it go?

UPDATE 2

For those interested, here's an example of an alternative struct method to read values into a namedtuple, as mentioned by eryksun:

>>> from struct import unpack
>>> record = b'raymond   \x32\x12\x08\x01\x08'
>>> name, serialnum, school, gradelevel = unpack('<10sHHb', record)

>>> from collections import namedtuple
>>> Student = namedtuple('Student', 'name serialnum school gradelevel')
>>> Student._make(unpack('<10sHHb', record))
Student(name=b'raymond   ', serialnum=4658, school=264, gradelevel=8)

Recommended answer
This line definition is actually for defining a bitfield:

... ("more_funky_numbers_7bytes", c_uint, 56), ...

which is wrong here. The size of a bitfield must be less than or equal to the size of its type, so a c_uint bitfield can be at most 32 bits wide; a single extra bit raises the exception:

ValueError: number of bits invalid for bit field
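The limit can be checked directly. The following is only a minimal sketch: the class names `Fits` and `TooWide` and the helper `define_too_wide` are made up for illustration, and the width is derived from `sizeof(c_uint)` rather than hard-coded to 32.

```python
from ctypes import Structure, c_uint, sizeof

# Largest legal bit width for a c_uint bitfield (32 on common platforms).
MAX_BITS = 8 * sizeof(c_uint)

class Fits(Structure):
    # A bitfield exactly as wide as its type is accepted.
    _fields_ = [("value", c_uint, MAX_BITS)]

def define_too_wide():
    # Hypothetical helper: declaring one bit more than the type holds
    # fails while the class is being created.
    class TooWide(Structure):
        _fields_ = [("value", c_uint, MAX_BITS + 1)]
    return TooWide

try:
    define_too_wide()
    error = None
except ValueError as exc:
    error = exc

print(sizeof(Fits))
print(type(error).__name__, error)
```

The exception comes from the class definition itself, so no data has to be read before the problem shows up.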

Example of using the bitfield:

from ctypes import *

class MyStructure(Structure):
    _fields_ = [
        # c_uint8 is 8 bits long
        ('a', c_uint8, 4),  # first 4 bits of the first byte
        ('b', c_uint8, 2),  # next 2 bits of the first byte
        ('c', c_uint8, 2),  # last 2 bits of the first byte
        ('d', c_uint8, 2),  # the first byte is full, so a new byte is
                            # created and `d` takes its first two bits
    ]

mystruct = MyStructure()

mystruct.a = 0b0000
mystruct.b = 0b11
mystruct.c = 0b00
mystruct.d = 0b11

v = c_uint16()

# copy `mystruct` into `v`; ctypes.memmove works on any platform
memmove(byref(v), byref(mystruct), sizeof(v))

print(sizeof(mystruct))  # 2 bytes, so 6 bits are left floating; you may
                         # want to memset with zeros
print(bin(v.value))      # 0b1100110000

What you need is 7 bytes, so what you ended up doing is correct:

... ("more_funky_numbers_7bytes", c_byte * 7), ...
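Since the question mentions decoding the 7 bytes with a custom function, here is one way that could look. This is a sketch under assumptions: the cut-down `Funky` structure, the `decode_7bytes` helper and the sample bytes are all made up, and the field is assumed to hold an unsigned big-endian integer.

```python
from ctypes import BigEndianStructure, c_byte, sizeof

class Funky(BigEndianStructure):
    # Hypothetical cut-down structure holding only the 7-byte field.
    _fields_ = [("more_funky_numbers_7bytes", c_byte * 7)]

def decode_7bytes(field):
    """Interpret a 7-element ctypes byte array as an unsigned big-endian int."""
    return int.from_bytes(bytes(field), "big")

raw = bytes.fromhex("00000000000102")  # made-up sample data
funky = Funky.from_buffer_copy(raw)

print(sizeof(Funky))                                   # 7
print(decode_7bytes(funky.more_funky_numbers_7bytes))  # 258
```

If the real format stores the value little-endian or signed, the arguments to `int.from_bytes` would need to change accordingly.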

As for the size of the structure, it's going to be 52: ctypes applies the platform's native alignment rules, so one extra byte is padded in to keep the 2-byte payload_length_2bytes field on an even offset. The extra byte sits between other_flags_1byte and payload_length_2bytes in the file:

00000000 11 11 11 11 ....
00000004 22 22 22 22 """"
00000008 22 22 22 22 """"
0000000C 22 22 22 22 """"
00000010 22 22 22 22 """"
00000014 22 22 22 22 """"
00000018 22 22 22 22 """"
0000001C 22 22 22 22 """"
00000020 22 22 22 22 """"
00000024 33 33 33 33 3333
00000028 44 44 44 44 DDDD
0000002C 44 44 44 55 DDDU
00000030 66 00 77 77 f.ww
            ^
         extra byte
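To see where the padding lands without looking at a hex dump, the field descriptors can be queried for their offsets. This is a small sketch that redefines the structure from the question's update so it runs on its own:

```python
from ctypes import BigEndianStructure, c_byte, c_char, c_uint, c_ushort, sizeof

class BinaryHeader(BigEndianStructure):
    _fields_ = [
        ("sequence_number_4bytes", c_uint),
        ("ascii_text_32bytes", c_char * 32),
        ("timestamp_4bytes", c_uint),
        ("more_funky_numbers_7bytes", c_byte * 7),
        ("some_flags_1byte", c_byte),
        ("other_flags_1byte", c_byte),
        ("payload_length_2bytes", c_ushort),
    ]

# Each field is exposed on the class as a descriptor with .offset and .size.
for name, _ctype in BinaryHeader._fields_:
    descriptor = getattr(BinaryHeader, name)
    print(f"{name:28} offset={descriptor.offset:2} size={descriptor.size}")

print("sizeof:", sizeof(BinaryHeader))  # 52
```

other_flags_1byte ends at offset 48 and payload_length_2bytes starts at 50, so offset 49 is the padding byte.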

This is an issue when it comes to file formats and network protocols. To change it, pack the structure with an alignment of 1:

 ...
class BinaryHeader(BigEndianStructure):
    _pack_ = 1
    _fields_ = [
        ("sequence_number_4bytes", c_uint),
...

The file will be:

00000000 11 11 11 11 ....
00000004 22 22 22 22 """"
00000008 22 22 22 22 """"
0000000C 22 22 22 22 """"
00000010 22 22 22 22 """"
00000014 22 22 22 22 """"
00000018 22 22 22 22 """"
0000001C 22 22 22 22 """"
00000020 22 22 22 22 """"
00000024 33 33 33 33 3333
00000028 44 44 44 44 DDDD
0000002C 44 44 44 55 DDDU
00000030 66 77 77    fww 
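Putting it together, a packed structure should make readinto() report 51. The sketch below assumes the sample bytes from the dump above and uses io.BytesIO as a stand-in for the real file; the class name `PackedHeader` is made up.

```python
import io
from ctypes import BigEndianStructure, c_byte, c_char, c_uint, c_ushort, sizeof

class PackedHeader(BigEndianStructure):
    _pack_ = 1  # no alignment padding between fields
    _fields_ = [
        ("sequence_number_4bytes", c_uint),
        ("ascii_text_32bytes", c_char * 32),
        ("timestamp_4bytes", c_uint),
        ("more_funky_numbers_7bytes", c_byte * 7),
        ("some_flags_1byte", c_byte),
        ("other_flags_1byte", c_byte),
        ("payload_length_2bytes", c_ushort),
    ]

# Made-up sample bytes matching the dump; BytesIO stands in for open(..., "rb").
data = (b"\x11" * 4 + b"\x22" * 32 + b"\x33" * 4
        + b"\x44" * 7 + b"\x55\x66\x77\x77")

header = PackedHeader()
bytes_read = io.BytesIO(data).readinto(header)

print(sizeof(PackedHeader), bytes_read)   # 51 51
print(hex(header.payload_length_2bytes))  # 0x7777
```

With the padding gone, the in-memory layout matches the 51-byte record exactly, so consecutive headers can be read one after another.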

As for struct, it won't make it easier in your case. Sadly it doesn't support nested tuples in the format. For example here:

>>> from struct import Struct
>>>
>>> data = (b'\x11' * 4 + b'\x22' * 32 + b'\x33' * 4
...         + b'\x44' * 7 + b'\x55\x66\x77\x77')
>>>
>>> BinaryHeader = Struct('>I32cI7BBBH')
>>>
>>> BinaryHeader.unpack(data)
(286331153, b'"', b'"', b'"', b'"', b'"', b'"', b'"', b'"', b'"', b'"', b'"',
 b'"', b'"', b'"', b'"', b'"', b'"', b'"', b'"', b'"', b'"', b'"', b'"', b'"',
 b'"', b'"', b'"', b'"', b'"', b'"', b'"', b'"', 858993459, 68, 68, 68, 68, 68,
 68, 68, 85, 102, 30583)
>>>

This result cannot be used with namedtuple; you still have to parse it based on the index. It would work if you could do something like '>I(32c)(I)(7B)(B)(B)H'. This feature has been requested here (Extend struct.unpack to produce nested tuples) since 2003, but nothing has been done since.
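One possible way around the flat tuple, offered as a sketch rather than a replacement for the ctypes approach: the `s` format code returns a whole run of bytes as a single value, so a format such as '>I32sI7sBBH' yields exactly one item per header field. The `Header` namedtuple and its field names below are made up for illustration.

```python
from collections import namedtuple
from struct import Struct

# '>' selects big-endian with no padding; 32s and 7s each yield one bytes object.
HEADER_FORMAT = Struct(">I32sI7sBBH")

# Hypothetical field names, one per item in the format above.
Header = namedtuple(
    "Header",
    "sequence_number ascii_text timestamp more_funky_numbers "
    "some_flags other_flags payload_length",
)

# Made-up sample bytes, the same 51-byte record as in the dump.
data = (b"\x11" * 4 + b"\x22" * 32 + b"\x33" * 4
        + b"\x44" * 7 + b"\x55\x66\x77\x77")

header = Header._make(HEADER_FORMAT.unpack(data))

print(HEADER_FORMAT.size)     # 51
print(header.payload_length)  # 30583
print(int.from_bytes(header.more_funky_numbers, "big"))
```

The 7-byte field still arrives as raw bytes and needs its own decoding step, the same as with `c_byte * 7`.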
