Pass str as an int array to a Python C extended function (extended using SWIG)

Problem description

How can I pass a str value (containing 3000 {'0', '1'} bytes) obtained in Python code as an argument to a Python C extension function (built with SWIG) that requires an int * (fixed-length int array) as its input argument? My code looks like this:

int *exposekey(int *bits) {
    int a[1000];
    for (int j=2000; j < 3000; j++) {
        a[j - 2000] = bits[j];
    }
    return a;
}

I tried using ctypes (see the code below):

import ctypes
ldpc = ctypes.cdll.LoadLibrary('./_ldpc.so')
arr = (ctypes.c_int * 3072)(<mentioned below>)
ldpc.exposekey(arr)

with 3072 {0, 1} values entered in place of the placeholder. Python returns SyntaxError: more than 255 arguments. In any case, this doesn't help me pass an already assigned str value instead of an initialized ctypes int array.

Other suggestions included using SWIG typemaps, but how would that work for converting a str into an int *? Thanks in advance.

Solution

Regarding my comment, here are some more details about returning arrays from functions: [SO]: Returning an array using C. In short, there are three ways to handle this:

  1. Make the returned variable static
  2. Dynamically allocate it (using malloc (family) or new)
  3. Turn it into an additional argument for the function

Getting that piece of C code to run within the Python interpreter is possible in 2 ways: through ctypes, or through an extension module (which is what swig generates).

Since they both do the same thing, mixing them together makes no sense. So, pick the one that best fits your needs.


1. ctypes

  • This is what you started with
  • It's one of the ways of doing things using ctypes
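
As a side note on the original attempt: a ctypes int array does not have to be written out literal by literal. The sketch below converts a '0'/'1' string into a (c_int * n) array by unpacking a generated list, which avoids typing thousands of arguments into the call. The names to_int_array and bits_str are hypothetical, chosen only for illustration.

```python
import ctypes


def to_int_array(bits_str):
    """Convert a string of '0'/'1' characters into a ctypes int array."""
    if set(bits_str) - {"0", "1"}:
        raise ValueError("expected only '0' and '1' characters")
    values = [int(ch) for ch in bits_str]
    array_type = ctypes.c_int * len(values)
    # The values are unpacked at call time instead of being written as literals.
    return array_type(*values)


bits_str = "01" * 1536  # 3072 characters, hypothetical sample input
arr = to_int_array(bits_str)
print(len(arr), arr[0], arr[1])
```

The resulting arr can be passed wherever the C side expects an int *, for example to a function whose argtypes entry is ctypes.POINTER(ctypes.c_int).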

ctypes_demo.c:

#include <stdio.h>

#if defined(_WIN32)
#  define CTYPES_DEMO_EXPORT_API __declspec(dllexport)
#else
#  define CTYPES_DEMO_EXPORT_API
#endif


CTYPES_DEMO_EXPORT_API int exposekey(char *bitsIn, char *bitsOut) {
    int ret = 0;
    printf("Message from C code...\n");
    for (int j = 0; j < 1000; j++)
    {
        bitsOut[j] = bitsIn[j + 2000];
        ret++;
    }
    return ret;
}

Notes:

  • Based on the comments, I changed the types in the function from int* to char*, because that is 4 times more compact (it still wastes 7 of every 8 bits, since each char carries a single bit of information; that can be fixed, but requires bitwise processing)
  • I turned a into the 2nd argument (bitsOut). I think this is best because allocating and deallocating the array becomes the caller's responsibility (the 3rd option from the beginning)
  • I also modified the index range (without changing functionality), because it makes more sense to work with low index values and add an offset to them in one place than to work with high index values and subtract the same offset in another place
  • The return value is the number of elements copied (obviously, 1000 in this case), but it's just an example
  • The printf call is just a dummy, to show that the C code gets executed
  • When dealing with such arrays, it's recommended to pass their dimensions as well, to avoid out of bounds errors. Also, error handling is an important aspect
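
The bitwise processing mentioned in the first note can be sketched as follows. This is shown in Python only to illustrate the packing idea; a C implementation would do the equivalent with shifts and masks. The helper names pack_bits and unpack_bits are made up for this example and are not part of the code above.

```python
def pack_bits(bits_str):
    """Pack a '0'/'1' string into bytes, 8 bits per byte, most significant bit first."""
    if set(bits_str) - {"0", "1"}:
        raise ValueError("expected only '0' and '1' characters")
    packed = bytearray((len(bits_str) + 7) // 8)
    for index, ch in enumerate(bits_str):
        if ch == "1":
            # Set the bit that corresponds to this position inside its byte.
            packed[index // 8] |= 0x80 >> (index % 8)
    return bytes(packed)


def unpack_bits(packed, count):
    """Recover the first `count` bits from bytes produced by pack_bits."""
    return "".join(
        "1" if packed[index // 8] & (0x80 >> (index % 8)) else "0"
        for index in range(count)
    )


sample = "010011000110101110101110101010010111011101101010101"
packed = pack_bits(sample)
print(len(sample), "characters ->", len(packed), "bytes")
```

Because the last byte may be only partially used, the bit count has to travel with the packed data, which is one more reason to pass dimensions explicitly.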

test_ctypes.py:

from ctypes import CDLL, c_char, c_char_p, c_int, create_string_buffer


bits_string = "010011000110101110101110101010010111011101101010101"


def main():
    dll = CDLL("./ctypes_demo.dll")
    exposekey = dll.exposekey

    exposekey.argtypes = [c_char_p, c_char_p]
    exposekey.restype = c_int

    # Pad to 3000 bytes: the C function reads indices 2000..2999.
    bits_in = create_string_buffer((b"\0" * 2000 + bits_string.encode()).ljust(3000, b"\0"))
    bits_out = create_string_buffer(1000)
    print("Before: [{}]".format(bits_out.raw[:len(bits_string)].decode()))
    ret = exposekey(bits_in, bits_out)
    print("After: [{}]".format(bits_out.raw[:len(bits_string)].decode()))
    print("Return code: {}".format(ret))


if __name__ == "__main__":
    main()

Notes:

  • First, I want to mention that running your code didn't raise the error you got
  • Always specify the function's argtypes and restype: without them ctypes assumes an int return value, and setting them also makes things easier (documented in the ctypes tutorial)
  • I am printing the bits_out array (only the first, relevant part, as the rest is 0) to show that the C code did its job
  • I initialize the bits_in array with 2000 dummy 0 bytes at the beginning, as those values are not relevant here. Also, the input string (bits_string) is not 3000 characters long (for obvious reasons). If your bits_string is 3000 characters long, you can simply initialize bits_in like this: bits_in = create_string_buffer(bits_string.encode())
  • Do not forget to initialize bits_out to an array large enough for its purpose (1000 in our example), otherwise a segfault might occur when the C code writes past its end
  • For this (simple) function, the ctypes variant was easier (at least for me, since I don't use swig frequently), but for more complex functions / projects it becomes overkill, and switching to swig would be the right thing to do
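
The buffer-size requirements from the notes above can be enforced on the Python side before the C function is called. The following is a minimal sketch under stated assumptions: KEY_OFFSET, KEY_LENGTH and make_buffers are hypothetical names, and ctypes.memmove stands in for exposekey so that the example runs without the compiled library.

```python
import ctypes

KEY_OFFSET = 2000  # index of the first byte the C function reads
KEY_LENGTH = 1000  # number of bytes the C function copies


def make_buffers(bits_str):
    """Validate the input length and return (bits_in, bits_out) ctypes buffers."""
    data = bits_str.encode("ascii")
    required = KEY_OFFSET + KEY_LENGTH
    if len(data) < required:
        raise ValueError(
            "need at least {} characters, got {}".format(required, len(data))
        )
    bits_in = ctypes.create_string_buffer(data)
    bits_out = ctypes.create_string_buffer(KEY_LENGTH)
    return bits_in, bits_out


bits_str = "0" * 2000 + "10" * 500  # hypothetical 3000-character input
bits_in, bits_out = make_buffers(bits_str)
# Stand-in for exposekey(bits_in, bits_out): copy 1000 bytes starting at offset 2000.
ctypes.memmove(bits_out, ctypes.addressof(bits_in) + KEY_OFFSET, KEY_LENGTH)
print(bits_out.raw[:10].decode())
```

With the real library loaded, the memmove line would be replaced by the exposekey call; the length check stays the same.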

Output (running with Python3.5 on Win):

c:\Work\Dev\StackOverflow\q47276327>"c:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" test_ctypes.py
Before: [                                                   ]
Message from C code...
After: [010011000110101110101110101010010111011101101010101]
Return code: 1000


2. swig

  • Almost everything from the ctypes section applies here as well

swig_demo.c:

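#include <stdio.h>
#include <stdlib.h>
#include "swig_demo.h"


char *exposekey(char *bitsIn) {
    /* One extra byte for the terminating NUL: the result is handled as a C string. */
    char *bitsOut = (char*)malloc(sizeof(char) * 1001);
    if (bitsOut == NULL)
        return NULL;
    printf("Message from C code...\n");
    for (int j = 0; j < 1000; j++) {
        bitsOut[j] = bitsIn[j + 2000];
    }
    bitsOut[1000] = '\0';
    return bitsOut;
}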

swig_demo.i:

%module swig_demo
%{
#include "swig_demo.h"
%}

%newobject exposekey;
%include "swig_demo.h"

swig_demo.h:

char *exposekey(char *bitsIn);

Notes:

  • Here I'm allocating the array and returning it (the 2nd option from the beginning)
  • The .i file is a standard swig interface file
    • It defines the module and its exports (via %include)
    • One thing worth mentioning is the %newobject directive: it tells swig that exposekey returns newly allocated memory, so the generated wrapper frees the pointer after converting it to a Python string, which avoids a memory leak
  • The .h file just contains the function declaration, in order to be included by the .i file (it's not mandatory, but things are more elegant this way)
  • The rest is pretty much the same

test_swig.py:

from swig_demo import exposekey

bits_in = "010011000110101110101110101010010111011101101010101"


def main():
    # Pad to 3000 characters: the C function reads indices 2000..2999.
    bits_out = exposekey(("\0" * 2000 + bits_in).ljust(3000, "\0"))
    print("C function returned: [{}]".format(bits_out))


if __name__ == "__main__":
    main()

Notes:

  • Things make much more sense from a Python programmer's PoV
  • The code is a lot shorter (because swig did some "magic" behind the scenes):
    • The wrapper .c file generated from the .i file is ~120K
    • The generated swig_demo.py module is ~3K
  • I used the same technique, with 2000 dummy 0 characters at the beginning of the string

Output:

c:\Work\Dev\StackOverflow\q47276327>"c:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" test_swig.py
Message from C code...
C function returned: [010011000110101110101110101010010111011101101010101]


3. Plain Python C API

  • I added this part as a personal exercise
  • This is what swig does, but "manually"

capi_demo.c:

#include "Python.h"
#include "swig_demo.h"

#define MOD_NAME "capi_demo"


static PyObject *PyExposekey(PyObject *self, PyObject *args) {
    PyObject *bitsInArg = NULL, *bitsInBytes = NULL, *bitsOutArg = NULL;
    char *bitsOut = NULL;
    if (!PyArg_ParseTuple(args, "O", &bitsInArg))
        return NULL;
    /* New reference: it must be released once the C call is done. */
    bitsInBytes = PyUnicode_AsEncodedString(bitsInArg, "ascii", "strict");
    if (bitsInBytes == NULL)
        return NULL;
    bitsOut = exposekey(PyBytes_AS_STRING(bitsInBytes));
    Py_DECREF(bitsInBytes);
    if (bitsOut == NULL)
        return PyErr_NoMemory();
    bitsOutArg = PyUnicode_FromString(bitsOut);
    free(bitsOut);
    return bitsOutArg;
}


static PyMethodDef moduleMethods[] = {
    {"exposekey", (PyCFunction)PyExposekey, METH_VARARGS, NULL},
    {NULL}
};


static struct PyModuleDef moduleDef = {
    PyModuleDef_HEAD_INIT, MOD_NAME, NULL, -1, moduleMethods
};


PyMODINIT_FUNC PyInit_capi_demo(void) {
    return PyModule_Create(&moduleDef);
}

Notes:

  • It requires swig_demo.h and swig_demo.c (not going to duplicate their contents here)
  • It only works with Python 3 (I actually got quite a few headaches making it work, especially because I was used to PyString_AsString, which no longer exists)
  • Error handling is minimal (for example, the input length is not checked)
  • test_capi.py is similar to test_swig.py with one (obvious) difference: from swig_demo import exposekey should be replaced by from capi_demo import exposekey
  • The output is also the same as for test_swig.py (again, not going to duplicate it here)
