Pass FILE * into function from Python / ctypes


Question

I have a library function (written in C) that generates text by writing the output to FILE *. I want to wrap this in Python (2.7.x) with code that creates a temp file or pipe, passes it into the function, reads the result from the file, and returns it as a Python string.

Here's a simplified example to illustrate what I'm after:

#include <stdio.h>

/* Library function */
void write_numbers(FILE *f, int arg1, int arg2)
{
    fprintf(f, "%d %d\n", arg1, arg2);
}

Python wrapper:

import os
from ctypes import *

mylib = CDLL('mylib.so')


def write_numbers(a, b):
    rd, wr = os.pipe()

    # MAGIC_HERE is the missing piece: it must turn the descriptor into a FILE *
    write_fp = MAGIC_HERE(wr)
    mylib.write_numbers(write_fp, a, b)
    os.close(wr)

    read_file = os.fdopen(rd)
    res = read_file.read()
    read_file.close()

    return res

# Should result in '1 2\n' being printed.
print write_numbers(1, 2)

I'm wondering what my best bet is for MAGIC_HERE().

I'm tempted to just use ctypes and create a libc.fdopen() wrapper that returns a Python c_void_p, then pass that into the library function. It seems like that should be safe in theory; I'm just wondering if there are issues with that approach or an existing Python-ism to solve this problem.

Also, this will go in a long-running process (let's just assume "forever"), so any leaked file descriptors are going to be problematic.

Solution

First, do note that FILE* is a stdio-specific entity. It doesn't exist at the system level. The things that exist at the system level are descriptors in UNIX (retrieved with file.fileno(); os.pipe() already returns plain descriptors) and handles in Windows (retrieved with msvcrt.get_osfhandle()). Thus FILE* is a poor choice as an inter-library exchange format if there can be more than one C runtime in action. You'll be in trouble if your library is compiled against a different C runtime than your copy of Python: 1) the binary layout of the structure may differ (e.g. due to alignment, additional members for debugging purposes, or even different type sizes); 2) in Windows, the file descriptors that the structure links to are C-specific entities as well, and their table is maintained internally by a C runtime [1].
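To make the descriptor/FILE* distinction concrete, here is a small sketch showing that descriptors are plain integers that can be read and written without stdio. The helper name descriptor_roundtrip is made up for illustration.

```python
import os
import tempfile


def descriptor_roundtrip(payload):
    """Write bytes to a pipe and read them back using raw descriptors only."""
    rd, wr = os.pipe()          # two plain integer descriptors, no FILE* involved
    try:
        os.write(wr, payload)   # system-level write, bypasses stdio buffering
        return os.read(rd, len(payload))
    finally:
        os.close(rd)
        os.close(wr)


# A file object exposes its underlying descriptor through fileno().
with tempfile.TemporaryFile() as tmp:
    tmp_fd = tmp.fileno()
```

Any FILE* that a C library needs has to be built on top of such a descriptor by a specific C runtime, which is where the compatibility concerns above come in.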

Moreover, in Python 3, I/O was overhauled in order to untangle it from stdio. So, FILE* is alien to that Python flavor (and likely, most non-C flavors, too).

Now, what you need is to

  • somehow guess which C runtime you need, and
  • call its fdopen() (or equivalent).

(One of Python's mottoes is "make the right thing easy and the wrong thing hard", after all.)


The cleanest method is to use the precise instance that the library is linked to (do pray that it's linked with it dynamically, or there'll be no exported symbol to call).

For the 1st item, I couldn't find any Python modules that can analyze a loaded dynamic module's metadata to find out which DLLs/so's it has been linked with (just a name or even name+version isn't enough, you know, due to possible multiple instances of the library on the system). It's definitely possible, though, since information about the formats is widely available.
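As a rough, Linux-only approximation of this step, one can at least list which C runtime images are mapped into the current process by reading /proc/self/maps. This is only a sketch under assumptions: the function name loaded_c_runtimes and the file-name patterns are illustrative, and the result does not prove which instance a given library is bound to. It only shows whether more than one candidate is loaded.

```python
import os


def loaded_c_runtimes(maps_path="/proc/self/maps"):
    """Return sorted paths of mapped shared objects that look like a C runtime.

    Returns an empty list where /proc is unavailable (e.g. non-Linux systems).
    """
    if not os.path.exists(maps_path):
        return []
    found = set()
    with open(maps_path) as maps:
        for line in maps:
            # Fields: address, perms, offset, dev, inode, pathname (optional)
            fields = line.split(None, 5)
            if len(fields) < 6:
                continue
            path = fields[5].strip()
            name = os.path.basename(path)
            # Assumed naming patterns for glibc and musl; adjust as needed.
            if name.startswith(("libc.", "libc-", "ld-musl")):
                found.add(path)
    return sorted(found)
```

If the list has exactly one entry after the library has been loaded, it is likely (though not guaranteed) that Python and the library share that runtime.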

For the 2nd item, it's a trivial ctypes.CDLL('path').fdopen (_fdopen for MSVCRT).
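A minimal sketch of that call, assuming a POSIX system and assuming that the C runtime located by ctypes.util.find_library("c") is the same instance the target library is linked against (which, per the caveats above, is not guaranteed). The helper name capture_text is hypothetical, and fputs stands in for the library function that takes a FILE*.

```python
import ctypes
import ctypes.util
import os

# Assumption: this is the same C runtime the target library uses.
# If find_library returns None, CDLL(None) falls back to the symbols
# already loaded into the process on POSIX systems.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Pointers must be declared explicitly; the default int return type
# would truncate a 64-bit FILE* address.
libc.fdopen.restype = ctypes.c_void_p
libc.fdopen.argtypes = (ctypes.c_int, ctypes.c_char_p)
libc.fputs.restype = ctypes.c_int
libc.fputs.argtypes = (ctypes.c_char_p, ctypes.c_void_p)
libc.fclose.restype = ctypes.c_int
libc.fclose.argtypes = (ctypes.c_void_p,)


def capture_text(text):
    """Write bytes through a C FILE* on a pipe and return what was written."""
    rd, wr = os.pipe()
    fp = libc.fdopen(wr, b"w")
    if not fp:
        os.close(rd)
        os.close(wr)
        raise OSError("fdopen() failed")
    try:
        # Stand-in for a call such as mylib.write_numbers(fp, a, b)
        libc.fputs(text, fp)
    finally:
        # fclose() flushes stdio's buffer and closes wr, so no os.close(wr)
        libc.fclose(fp)
    with os.fdopen(rd, "rb") as reader:
        return reader.read()
```

Closing the stream with fclose() rather than os.close() matters here: it flushes the buffered output before the descriptor goes away and releases the FILE structure, which helps avoid leaks in a long-running process. Note that output larger than the pipe buffer would block the writer unless the read end is drained concurrently.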


Second, you can write a small helper module that is compiled against the same (or a guaranteed compatible) runtime as the library and does the conversion from the aforementioned descriptor/handle for you. This is effectively a workaround that avoids editing the library itself.


Finally, there's the simplest (and the dirtiest) method using Python's C runtime instance (so all the above warnings apply in full) through Python C API available via ctypes.pythonapi. It takes advantage of

  • the fact that Python 2's file-like objects are wrappers over stdio's FILE* (Python 3's are not)
  • PyFile_AsFile API that returns the wrapped FILE* (note that it's missing from Python 3)
    • for a standalone fd, you need to construct a file-like object first (so that there would be a FILE* to return ;) )
  • the fact that id() of an object is its memory address (CPython-specific) [2]

    >>> import ctypes
    >>> open("test.txt")
    <open file 'test.txt', mode 'r' at 0x017F8F40>
    >>> f = _
    >>> f.fileno()
    3
    >>> ctypes.pythonapi
    <PyDLL 'python dll', handle 1e000000 at 12808b0>
    >>> api = _
    >>> api.PyFile_AsFile
    <_FuncPtr object at 0x018557B0>
    >>> # as per the ctypes docs, functions are assumed to return int by default
    >>> api.PyFile_AsFile.restype = ctypes.c_void_p
    >>> # as of 2.7.10, long integers are silently truncated to ints,
    >>> # see http://bugs.python.org/issue24747
    >>> api.PyFile_AsFile.argtypes = (ctypes.c_void_p,)
    >>> api.PyFile_AsFile(id(f))
    2019259400

Do keep in mind that with fds and C pointers, you need to ensure proper object lifetimes by hand!

  • file-like objects returned by os.fdopen() do close the descriptor on .close()
    • so duplicate descriptors with os.dup() if you need them after a file object is closed/garbage collected
  • while working with the C structure, adjust the corresponding file object's use count with PyFile_IncUseCount()/PyFile_DecUseCount()
  • ensure there is no other I/O on the descriptors/file objects in the meantime, since it would corrupt the data (e.g. once iter(f) or "for l in f" has been called, the file object does its own internal caching that is independent of stdio's buffering)
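The os.dup() point can be illustrated with a short sketch (the function name dup_survives_close is made up for this example): the duplicate descriptor stays usable after the file object that owned the original has been closed, and it must then be closed explicitly to avoid a leak.

```python
import os


def dup_survives_close():
    """Show that a dup()'ed descriptor outlives the file object's close()."""
    rd, wr = os.pipe()
    keep = os.dup(wr)            # independent descriptor for the same pipe end
    writer = os.fdopen(wr, "wb")
    writer.write(b"a")
    writer.close()               # flushes and closes wr, but not keep
    os.write(keep, b"b")         # still valid
    os.close(keep)               # our responsibility to close the duplicate
    data = os.read(rd, 10)
    os.close(rd)
    return data
```

In a long-running process, every such duplicate needs a matching os.close(), otherwise descriptors accumulate.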
