struct: type registration?


Problem description

Hello,

given the ongoing work on struct (which I thought was a dead module), I was
wondering if it would be possible to add an API to register custom parsing
codes for struct. Whenever I use it for non-trivial tasks, I always happen to
write small wrapper functions to adjust the values returned by struct.

An example API would be the following:

============================================
def mystring_len():
    return 20

def mystring_pack(s):
    if len(s) > 20:
        raise ValueError("a mystring can be at max 20 chars")
    s = (s + "\0"*20)[:20]
    s = struct.pack("20s", s)
    return s

def mystring_unpack(s):
    assert len(s) == 20
    s = struct.unpack("20s", s)[0]
    idx = s.find("\0")
    if idx >= 0:
        s = s[:idx]
    return s

struct.register("S", mystring_pack, mystring_unpack, mystring_len)

# then later
foo = struct.unpack("iilS", data)
============================================

This is only an example; any similar API that fits struct's internals better
would do as well.
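
Since struct itself has no register() function, the proposal above can only be approximated in user code. The following is a minimal sketch in Python 3, assuming fixed-size custom codes and an explicit (non-native) byte order; the names register, unpack, _custom_codes and the bytes-based mystring helpers are hypothetical and not part of the struct module.

```python
import re
import struct

# Hypothetical registry: format code -> (pack_fn, unpack_fn, size in bytes).
# Nothing here is part of the real struct module.
_custom_codes = {}

# One token is an optional repeat count followed by a single format code.
_TOKEN = re.compile(r"(\d*)([A-Za-z?])")


def register(code, pack_fn, unpack_fn, size):
    """Register a fixed-size custom format code."""
    _custom_codes[code] = (pack_fn, unpack_fn, size)


def unpack(fmt, data):
    """Unpack data like struct.unpack, dispatching custom codes to the registry."""
    # Fields are decoded one token at a time, so native alignment ("@")
    # cannot be honoured; default to "=" (native order, no padding).
    if fmt[:1] == "@":
        raise ValueError("native alignment is not supported in this sketch")
    if fmt[:1] in ("=", "<", ">", "!"):
        order, body = fmt[0], fmt[1:]
    else:
        order, body = "=", fmt
    values, offset = [], 0
    for count, code in _TOKEN.findall(body):
        if code in _custom_codes:
            _, unpack_fn, size = _custom_codes[code]
            for _ in range(int(count or 1)):
                values.append(unpack_fn(data[offset:offset + size]))
                offset += size
        else:
            # Standard codes are delegated to struct, keeping the outer byte order.
            piece = order + count + code
            values.extend(struct.unpack_from(piece, data, offset))
            offset += struct.calcsize(piece)
    if offset != len(data):
        raise struct.error("data length does not match format")
    return tuple(values)


def mystring_pack(s):
    raw = s.encode("ascii")
    if len(raw) > 20:
        raise ValueError("a mystring can be at max 20 bytes")
    return raw.ljust(20, b"\0")


def mystring_unpack(raw):
    # Only the first NUL terminates the string, as in C.
    return raw.split(b"\0", 1)[0].decode("ascii")


register("S", mystring_pack, mystring_unpack, 20)

data = struct.pack("<iil", 1, 2, 3) + mystring_pack("abcde")
print(unpack("<iilS", data))  # (1, 2, 3, 'abcde')
```

A matching pack() would walk the same tokens and call pack_fn for custom codes. Variable-size types would need a different callback signature, because the size is not known before decoding.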

As shown, the custom packer/unpacker can call the original pack/unpack as a
basis for its work. I guess an issue with this could be the endianness
problem: it would make sense if, when called recursively, struct.pack/unpack
used by default the endianness specified by the outer format string.
--
Giovanni Bajo
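
To illustrate the endianness concern: in the real struct module the byte order comes from the first character of the format string, so a nested call with no prefix falls back to native order. The sketch below shows the difference and one possible workaround, passing the outer byte-order character to the custom packer explicitly. The pack_prefixed helper and its order parameter are hypothetical.

```python
import struct

value = 0x01020304
print(struct.pack("<I", value).hex())  # 04030201 (little-endian)
print(struct.pack(">I", value).hex())  # 01020304 (big-endian)


def pack_prefixed(payload, order):
    """Hypothetical custom packer: 16-bit length prefix followed by the payload.

    `order` is the byte-order character taken from the outer format string,
    so the nested struct.pack call does not silently use native order.
    """
    return struct.pack(order + "H", len(payload)) + payload


print(pack_prefixed(b"ab", ">"))  # b'\x00\x02ab'
print(pack_prefixed(b"ab", "<"))  # b'\x02\x00ab'
```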

Recommended answer

On 1/06/2006 10:50 AM, Giovanni Bajo wrote:
> Hello,
>
> given the ongoing work on struct (which I thought was a dead module), I was
> wondering if it would be possible to add an API to register custom parsing
> codes for struct. Whenever I use it for non-trivial tasks, I always happen to
> write small wrapper functions to adjust the values returned by struct.
>
> An example API would be the following:
>
> ============================================
> def mystring_len():
>     return 20
>
> def mystring_pack(s):
>     if len(s) > 20:
>         raise ValueError("a mystring can be at max 20 chars")
>     s = (s + "\0"*20)[:20]

Have you considered s.ljust(20, "\0") ?

>     s = struct.pack("20s", s)
>     return s

I am an idiot, so please be gentle with me: I don't understand why you
are using struct.pack at all:

|>>> import struct
|>>> x = ("abcde" + "\0" * 20)[:20]
|>>> x
'abcde\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
|>>> len(x)
20
|>>> y = struct.pack("20s", x)
|>>> y == x
True
|>>>

Looks like a big fat no-op to me; you've done all the heavy lifting
yourself.

> def mystring_unpack(s):
>     assert len(s) == 20
>     s = struct.unpack("20s", s)[0]

Errrm, g'day, it's that pesky idiot again:

|>>> z = struct.unpack("20s", y)[0]
|>>> z
'abcde\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
|>>> z == y == x
True

>     idx = s.find("\0")
>     if idx >= 0:
>         s = s[:idx]
>     return s

Have you considered this:

|>>> z.rstrip("\0")
'abcde'
|>>> ("\0" * 20).rstrip("\0")
''
|>>> ("x" * 20).rstrip("\0")
'xxxxxxxxxxxxxxxxxxxx'

> struct.register("S", mystring_pack, mystring_unpack, mystring_len)
>
> # then later
> foo = struct.unpack("iilS", data)
> ============================================
>
> This is only an example; any similar API that fits struct's internals better
> would do as well.
>
> As shown, the custom packer/unpacker can call the original pack/unpack as a
> basis for its work. I guess an issue with this could be the endianness
> problem: it would make sense if, when called recursively, struct.pack/unpack
> used by default the endianness specified by the outer format string.



John Machin wrote:
>> given the ongoing work on struct (which I thought was a dead
>> module), I was wondering if it would be possible to add an API to
>> register custom parsing codes for struct. Whenever I use it for
>> non-trivial tasks, I always happen to write small wrapper functions
>> to adjust the values returned by struct.
>>
>> An example API would be the following:
>>
>> ============================================
>> def mystring_len():
>>     return 20
>>
>> def mystring_pack(s):
>>     if len(s) > 20:
>>         raise ValueError("a mystring can be at max 20 chars")
>>     s = (s + "\0"*20)[:20]
>
> Have you considered s.ljust(20, "\0") ?

Right. This happened to be an example...

>>     s = struct.pack("20s", s)
>>     return s
>
> I am an idiot, so please be gentle with me: I don't understand why you
> are using struct.pack at all:


Because I want to be able to parse large chunks of binary data with custom
formatting. Did you miss the whole point of my message:

struct.unpack("3liiSiiShh", data)

You need struct.unpack() to parse this data, and you need a custom
packer/unpacker to avoid post-processing the output of unpack() just because it
only knows about basic Python types. In binary structs, there happen to be
*types* which do not map 1:1 to Python types, nor are they just basic C types
(like the ones struct supports). Using a custom formatter is a way to better
represent these types (instead of mapping them to the "most similar" type and
then post-processing it).

In my example, "S" is a basic type which is a "0-terminated 20-byte string",
and expressing it in the struct format with the single letter "S" is more
meaningful in my code than using "20s" and then post-processing the resulting
string each and every time this happens.

> |>>> import struct
> |>>> x = ("abcde" + "\0" * 20)[:20]
> |>>> x
> 'abcde\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
> |>>> len(x)
> 20
> |>>> y = struct.pack("20s", x)
> |>>> y == x
> True
> |>>>
>
> Looks like a big fat no-op to me; you've done all the heavy lifting
> yourself.

Looks like you totally misread my message. Your string "x" is what I find in
binary data, and I need to *unpack* it into a regular Python string, which
would be "abcde".

>>     idx = s.find("\0")
>>     if idx >= 0:
>>         s = s[:idx]
>>     return s
>
> Have you considered this:
>
> |>>> z.rstrip("\0")
> 'abcde'

This would not work because, in the actual binary data I have to parse, only
the first \0 is meaningful and terminates the string (like in C). There is
absolutely no guarantee that the rest of the padding is made of \0s as well.
--
Giovanni Bajo
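
The difference between stripping trailing NULs and cutting at the first NUL can be shown with a field whose padding is not zero-filled. This is a small Python 3 sketch using bytes; the sample field contents are made up for illustration.

```python
import struct

# 20-byte field: "abcde", a terminating NUL, then non-zero leftover padding.
raw = b"abcde\0leftover junk!"
assert len(raw) == 20

(field,) = struct.unpack("20s", raw)

# rstrip only removes NULs at the very end, so the junk survives.
print(field.rstrip(b"\0"))       # b'abcde\x00leftover junk!'

# Splitting at the first NUL gives C-string semantics.
print(field.split(b"\0", 1)[0])  # b'abcde'
```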


Giovanni Bajo wrote:
> You need struct.unpack() to parse this data, and you need a custom
> packer/unpacker to avoid post-processing the output of unpack() just
> because it only knows about basic Python types. In binary structs, there
> happen to be *types* which do not map 1:1 to Python types, nor are they
> just basic C types (like the ones struct supports). Using a custom
> formatter is a way to better represent these types (instead of
> mapping them to the "most similar" type and then post-processing it).
>
> In my example, "S" is a basic type which is a "0-terminated 20-byte
> string", and expressing it in the struct format with the single
> letter "S" is more meaningful in my code than using "20s" and then
> post-processing the resulting string each and every time this happens.

Another compelling example is the SSH protocol:
http://www.openssh.com/txt/draft-iet...tecture-12.txt
Go to section 4, "Data Type Representations Used in the SSH Protocols", which
describes the data types used by the SSH protocol. In a perfect world, I would
write some custom packers/unpackers for those types which struct does not
handle already (like the "mpint" format), so that I could use struct to parse
and compose SSH messages. What I ended up doing was writing a new module,
sshstruct.py, from scratch, which duplicates struct's work, just because I
couldn't extend struct. Some examples:

client.py:     cookie, server_algorithms, guess, reserverd =
                   sshstruct.unpack("16b10LBu", data[1:])
client.py:     prompts = sshstruct.unpack("sssu" + "sB"*num_prompts,
                   pkt[1:])
connection.py: pkt = sshstruct.pack("busB", SSH_MSG_CHANNEL_REQUEST,
                   self.recipient_number, type, reply) + custom
kex.py:        self.P, self.G = sshstruct.unpack("mm", pkt[1:])

Notice for instance how "s" is an SSH string and unpacks directly to a Python
string, and "m" is an SSH mpint (infinite precision integer) but unpacks
directly into a Python long. Using struct.unpack() this would have been
impossible and would have required much post-processing.

Actually, another thing that struct should support to cover the SSH protocol
(and many other binary protocols) is the ability to parse strings whose size is
not known at import-time (variable-length data types). For instance, type
"string" in the SSH protocol is a string prepended with its size as uint32. So
its actual size depends on each instance. For this reason, my sshstruct did
not have the equivalent of struct.calcsize(). I guess that if there's a way to
extend struct, it would comprehend variable-size data types (and calcsize()
would return -1 or raise an exception).
--
Giovanni Bajo
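
The variable-length types described above can be sketched with the real struct module plus a little manual bookkeeping. This is not the sshstruct module mentioned in the post, whose source is not shown. The helper names read_string and read_mpint are hypothetical, and the encoding assumed here is a big-endian uint32 length prefix for "string" and a big-endian two's complement payload for "mpint".

```python
import struct


def read_string(buf, offset=0):
    """Read a uint32 length-prefixed string; return (payload, next_offset)."""
    (length,) = struct.unpack_from(">I", buf, offset)
    start = offset + 4
    end = start + length
    if end > len(buf):
        raise ValueError("truncated SSH string")
    return buf[start:end], end


def read_mpint(buf, offset=0):
    """Read an mpint (assumed: string holding a big-endian two's complement integer)."""
    raw, end = read_string(buf, offset)
    return int.from_bytes(raw, "big", signed=True), end


# A made-up packet: the string "hello" followed by the mpint 0x80.
packet = struct.pack(">I", 5) + b"hello" + bytes.fromhex("000000020080")

name, pos = read_string(packet)
number, pos = read_mpint(packet, pos)
print(name, number, pos)  # b'hello' 128 15
```

Because each field's size is only known after its length prefix has been read, a calcsize() equivalent cannot be computed from the format alone, which matches the limitation described in the post.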

