Python: Similar functionality in struct and array vs ctypes

Question

Python provides the following three modules that deal with C types and how to handle them:

  • struct for C structs
  • array for arrays such as those in C
  • ctypes for C functions, which necessarily entails dealing with C's type system

While ctypes seems more general and flexible than struct and array (its main task being "a foreign function library for Python"), there seems to be significant overlap in functionality between these three modules when the task is to read binary data structures. For example, if I wanted to read a C struct

struct MyStruct {
    int a;
    float b;
    char c[12];
};

I could use struct as follows:

import struct

# Native byte order and alignment: int, float, 12-byte string
a, b, c = struct.unpack('if12s', b'\x11\0\0\0\x12\x34\x56\x78hello world\0')
print(a, b, c)
# 17 1.7378244361449504e+34 b'hello world\x00'

On the other hand, using ctypes works equally well (although it is a bit more verbose):

import ctypes

class MyStruct(ctypes.Structure):
    _fields_ = [
        ('a', ctypes.c_int),
        ('b', ctypes.c_float),
        ('c', ctypes.c_char * 12)
    ]

s = MyStruct.from_buffer_copy(b'\x11\0\0\0\x12\x34\x56\x78hello world\0')
print(s.a, s.b, s.c)
# 17 1.7378244361449504e+34 b'hello world'

(Aside: I do wonder where the trailing '\0' went in this version, though…)

This seems to me like it violates the principles in "The Zen of Python":

  1. There should be one-- and preferably only one --obvious way to do it.

So how did this situation with several of these similar modules for binary data handling arise? Is there a historical or practical reason? (For example, I could imagine omitting the struct module entirely and simply adding a more convenient API for reading/writing C structs to ctypes.)

Recommended answer

Disclaimer: this post is speculation based on my understanding of the "division of labor" in the Python stdlib, not on factual, referenceable information.

Your question stems from the fact that "C structs" and "binary data" tend to be used interchangeably, which, while correct in practice, is wrong in a technical sense. The struct documentation is also misleading: it claims to work on "C structs", while a better description would be "binary data", with some disclaimers about C compatibility.

Fundamentally, struct, array and ctypes do different things. struct deals with converting Python values into binary in-memory formats. array deals with efficiently storing a lot of values. ctypes deals with the C language (*). The overlap in functionality stems from the fact that for C, the "binary in-memory formats" are native, and that "efficiently storing values" means packing them into a C-like array.
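To make the overlap concrete, here is a minimal sketch (the variable names are illustrative, not from the original post) that stores the same three integers with each module. It assumes the platform's native `int` layout, which all three use by default here.

```python
import array
import ctypes
import struct

values = [1, 2, 3]

# struct: convert Python values to bytes according to a format string.
packed = struct.pack('3i', *values)

# array: a compact, typed container backed by a C-style array of int.
arr = array.array('i', values)

# ctypes: an actual C array type, equivalent to int[3].
c_arr = (ctypes.c_int * 3)(*values)

# All three hold the same native-order bytes.
same_bytes = packed == arr.tobytes() == bytes(c_arr)
print(same_bytes)  # True
```

The bytes are identical, but the intent differs: struct produces a one-off byte string, array is a mutable sequence, and ctypes yields an object that can be handed to C code.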

You will also note that struct lets you easily specify endianness, because it deals with packing and unpacking binary data in the many different ways it can be packed, while in ctypes it is more difficult to get non-native byte order, because it uses the byte order that is native to C.
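As a rough illustration of the difference, the sketch below reads the same four bytes with both modules. The class name `BigEndianInt` is made up for this example; `ctypes.BigEndianStructure` is the standard-library base class for non-native byte order.

```python
import ctypes
import struct

raw = b'\x00\x00\x00\x11'

# struct: byte order is a one-character prefix of the format string.
(big,) = struct.unpack('>i', raw)     # big-endian
(little,) = struct.unpack('<i', raw)  # little-endian
print(big, little)  # 17 285212672

# ctypes: byte order is chosen by the structure's base class.
class BigEndianInt(ctypes.BigEndianStructure):  # hypothetical example type
    _fields_ = [('value', ctypes.c_int32)]

be = BigEndianInt.from_buffer_copy(raw)
print(be.value)  # 17
```

With struct, switching byte order is a one-character change. With ctypes it means declaring a different type, and the choice applies to the whole structure.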

If your task is reading binary data structures, there are increasing levels of abstraction:

  1. Manually splitting the byte array and converting parts with int.from_bytes and the like
  2. Describing the data with a format string and using struct to unpack in one go
  3. Using a library like Construct to describe the structure declaratively in logical terms.
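The first two levels can be sketched on the sample bytes from the question. This is an illustration only: it assumes a little-endian layout without padding (the `<` prefix), and it leaves out the third level because Construct is a third-party package.

```python
import struct

data = b'\x11\0\0\0\x12\x34\x56\x78hello world\0'

# Level 1: slice by hand and convert each piece.
a_manual = int.from_bytes(data[0:4], 'little', signed=True)
c_manual = data[8:20]

# Level 2: describe the whole layout once and unpack in one go.
# '<' pins little-endian byte order and disables padding.
a, b, c = struct.unpack('<if12s', data)

print(a_manual, a)    # 17 17
print(c_manual == c)  # True
```

Note that level 1 has no direct way to decode the float; int.from_bytes only covers integers, which is one reason the format-string approach scales better.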

ctypes doesn't even figure here, because for this task, using ctypes is pretty much taking a round trip through a different programming language. The fact that it works just as well for your example is incidental; it works because C is natively suited to expressing many ways of packing binary data. But if your struct were mixed-endian, for instance, it would be very difficult to express in ctypes. Another example is the half-precision float, which doesn't have a C equivalent (see here).
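Both cases can be handled with struct alone, as the sketch below suggests. The mixed-endian record layout is invented for illustration, and the `e` format code for half-precision floats requires Python 3.6 or later.

```python
import struct

# Half-precision float: format code 'e'.
half_bytes = struct.pack('<e', 1.0)
(half_value,) = struct.unpack('<e', half_bytes)
print(half_bytes, half_value)  # b'\x00<' 1.0

# Hypothetical mixed-endian record:
# a little-endian uint16 followed by a big-endian uint32.
record = b'\x01\x00' + b'\x00\x00\x00\x02'
(lo,) = struct.unpack_from('<H', record, 0)
(hi,) = struct.unpack_from('>I', record, 2)
print(lo, hi)  # 1 2
```

A struct format string accepts only one byte-order prefix, so the mixed-endian record is read with two calls at different offsets.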

In this sense, it's also very reasonable that ctypes uses struct - after all, "packing and unpacking binary data" is a subtask of "interfacing with C".

On the other hand, it would make no sense for struct to use ctypes: it would be like using the email library for character encoding conversions because it's a task that an e-mail library can do.

(*) Well, basically. More precise would be something like "C-based environments", i.e., how modern computers work at a low level due to co-evolution with C as the primary systems language.
