Cython undefined symbol with C wrapper

Question

I am trying to expose C code to Cython and am running into "undefined symbol" errors when I try to use functions defined in my .c file from another Cython module.
Functions defined in my .h files and functions that go through a manual wrapper work without a problem.

Basically the same case as this question, but the solution (linking against the library) isn't satisfactory for me.
I assume I am missing something in the setup.py script?



Minimized example of my case:

foo.h

int source_func(void);

inline int header_func(void){
    return 1;
}

foo.c

#include "foo.h"

int source_func(void){
    return 2;
}



foo_wrapper.pxd

cdef extern from "foo.h":
    int source_func()
    int header_func()

cdef source_func_wrapper()

foo_wrapper.pyx

cdef source_func_wrapper():
    return source_func()



The Cython module in which I want to use the functions:

test_lib.pyx

cimport foo_wrapper

def do_it():
    print "header func"
    print foo_wrapper.header_func() # ok
    print "source func wrapped"
    print foo_wrapper.source_func_wrapper() # ok    
    print "source func"
    print foo_wrapper.source_func() # undefined symbol: source_func



setup.py, which builds foo_wrapper and test_lib

from distutils.core import setup
from distutils.extension import Extension
from Cython.Build import cythonize

# setup wrapper
setup(
    ext_modules = cythonize([
        Extension("foo_wrapper", ["foo_wrapper.pyx", "foo.c"])
    ])
)

# setup test module 
setup(
    ext_modules = cythonize([
        Extension("test_lib", ["test_lib.pyx"])
    ])
)


Recommended answer

There are 3 different types of function in foo_wrapper:


  1. source_func_wrapper is a cdef function of the Cython module foo_wrapper. A cimporting module obtains it at import time through the Python run-time, so no linker is involved.
  2. header_func is an inline function which is used at compile time, so its definition/machine code is not needed later on.
  3. source_func, on the other hand, must be resolved by the static linker (this is the case in foo_wrapper) or by the dynamic linker (I assume this is your wish for test_lib).

Further down I'll try to explain why the setup doesn't work out of the box, but first I would like to introduce the two (at least in my opinion) best alternatives:

A: Avoid this problem altogether. Your foo_wrapper wraps C functions from foo.h. That means every other module should use these wrapper functions. If everyone can just access the functionality directly, the whole wrapper becomes kind of obsolete. Hide the foo.h interface in your pyx file:

#foo_wrapper.pxd
cdef source_func_wrapper()
cdef header_func_wrapper()


#foo_wrapper.pyx
cdef extern from "foo.h":
    int source_func()
    int header_func()

cdef source_func_wrapper():
    return source_func()
cdef header_func_wrapper():
    return header_func()

B: It might be valid to want to use the foo functionality directly via C functions. In this case we should use the same strategy as Cython does with the stdc++ library: foo.c should become a shared library, and there should be only a foo.pxd file (no pyx!) which can be imported via cimport wherever needed. Additionally, libfoo.so should then be added as a dependency to both foo_wrapper and test_lib.

However, approach B means more hassle - you need to put libfoo.so somewhere the dynamic loader can find it...
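A minimal sketch of what the build side of approach B could look like. It assumes that libfoo.so has already been built next to the extensions (for example with `gcc -shared -fPIC foo.c -o libfoo.so`) and is shipped alongside them. The helper name `foo_extension_kwargs` is hypothetical; the keyword arguments it returns are ordinary `Extension` options.

```python
def foo_extension_kwargs(lib_name="foo", lib_dir="."):
    """Keyword arguments for Extension(...) that link against lib<lib_name>.so.

    Assumption: lib<lib_name>.so lives in lib_dir at build time and sits
    next to the extension modules at run time.
    """
    return {
        "libraries": [lib_name],      # becomes -lfoo on the linker line
        "library_dirs": [lib_dir],    # becomes -L. (build-time search path)
        # Run-time search path, relative to the extension's own location.
        "extra_link_args": ["-Wl,-rpath,$ORIGIN"],
    }


# Usage in setup.py (not executed here):
#   Extension("foo_wrapper", ["foo_wrapper.pyx"], **foo_extension_kwargs())
#   Extension("test_lib", ["test_lib.pyx"], **foo_extension_kwargs())
```

Both extensions then carry the same dependency on libfoo.so, and neither depends on the other for the C symbols.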

Other alternatives:

As we will see, there are a lot of ways to get foo_wrapper+test_lib to work. First, let's see in more detail how loading of dynamic libraries works in Python.

First we take a look at test_lib.so:

>>> nm test_lib.so --undefined
....
   U PyXXXXX
   U source_func

There are a lot of undefined symbols, most of which start with Py and will be provided by the python executable at run time. But there is also our evildoer - source_func.
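The same check can be scripted. The following is only a sketch: it assumes binutils' `nm` is installed, and the function names are made up for illustration. It filters out the CPython API symbols so that what remains (libc functions plus any culprit such as source_func) is easier to scan.

```python
import subprocess


def non_python_undefined(nm_output):
    """Return undefined symbols from `nm --undefined-only` output that do not
    look like CPython API symbols (Py... / _Py...)."""
    symbols = []
    for line in nm_output.splitlines():
        parts = line.split()
        # Undefined entries look like "U name" (there is no address column).
        if len(parts) == 2 and parts[0] == "U":
            name = parts[1].split("@")[0]  # drop symbol-version suffixes
            if not name.startswith(("Py", "_Py")):
                symbols.append(name)
    return symbols


def undefined_symbols(path):
    """Run nm on a shared object and filter its output."""
    result = subprocess.run(["nm", "--undefined-only", path],
                            capture_output=True, text=True, check=True)
    return non_python_undefined(result.stdout)
```

Calling `undefined_symbols("test_lib.so")` in the build directory would be the scripted equivalent of the command above.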

Now we start python via

LD_DEBUG=libs,files,symbols python

and load our extension via import test_lib. In the resulting debug trace we can see the following:

>>>>: file=./test_lib.so [0];  dynamically loaded by python [0]

python loads test_lib.so via dlopen and starts to look-up/resolve the undefined symbols from test_lib.so:

>>>>:  symbol=PyExc_RuntimeError;  lookup in file=python [0]
>>>>:  symbol=PyExc_TypeError;  lookup in file=python [0]

These Python symbols are found pretty quickly - they are all defined in the python executable, the first place the dynamic linker looks (if this executable was linked with -Wl,-export-dynamic). But it is different with source_func:

 >>>>: symbol=source_func;  lookup in file=python [0]
 >>>>: symbol=source_func;  lookup in file=/lib/x86_64-linux-gnu/libpthread.so.0 [0]
  ...
 >>>>: symbol=source_func;  lookup in file=/lib64/ld-linux-x86-64.so.2 [0]
 >>>>:  ./test_lib.so: error: symbol lookup error: undefined symbol: source_func (fatal)

So after looking through all loaded shared libraries the symbol is not found and the import has to abort. The fun fact is that foo_wrapper is not yet loaded, so source_func cannot be looked up there (python would load it only in the next step, as a dependency of test_lib).
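To follow the search order for a single symbol without reading the whole trace, the LD_DEBUG output can be filtered. A rough sketch, assuming a glibc-based Linux (LD_DEBUG is a glibc loader feature and its trace goes to stderr); both function names are hypothetical:

```python
import os
import subprocess
import sys


def symbol_lookups(trace, symbol):
    """Return the files the loader searched for `symbol`, in order, given
    LD_DEBUG=symbols output."""
    needle = "symbol=" + symbol + ";"
    files = []
    for line in trace.splitlines():
        if needle in line and "lookup in file=" in line:
            # "... lookup in file=/path/lib.so [0]" -> "/path/lib.so"
            files.append(line.split("lookup in file=", 1)[1].split(" [", 1)[0])
    return files


def trace_import(module, symbol):
    """Import `module` in a child interpreter with LD_DEBUG enabled and
    report where the loader looked for `symbol`."""
    env = dict(os.environ, LD_DEBUG="symbols")
    proc = subprocess.run([sys.executable, "-c", "import " + module],
                          env=env, capture_output=True, text=True)
    return symbol_lookups(proc.stderr, symbol)
```

With the original setup, `trace_import("test_lib", "source_func")` should list the executable and the already loaded system libraries, but not foo_wrapper.so.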

Things are different if we preload foo_wrapper.so:

  LD_DEBUG=libs,files,symbols LD_PRELOAD=$(pwd)/foo_wrapper.so python

This time, calling import test_lib succeeds, because the preloaded foo_wrapper is the first place the dynamic loader looks up the symbols (after the python executable):

  >>>>: symbol=source_func;  lookup in file=python [0]
  >>>>: symbol=source_func;  lookup in file=/home/ed/python_stuff/cython/two/foo_wrapper.so [0]

But how does it work, when foo_wrapper.so is not preloaded? First let's add foo_wrapper.so as library to our setup of test_lib:

ext_modules = cythonize([
    Extension("test_lib", ["test_lib.pyx"], 
              libraries=[':foo_wrapper.so'], 
              library_dirs=['.'],
    )])   

This would lead to the following linker command:

 gcc ... test_lib.o -L. -l:foo_wrapper.so -o test_lib.so

If we now look up the symbols, we see no difference:

>>> nm test_lib.so --undefined
....
   U PyXXXXX
   U source_func

source_func is still undefined! So what is the advantage of linking against the shared library? The difference is that foo_wrapper.so is now listed as needed for test_lib.so:

>>>> readelf -d test_lib.so| grep NEEDED
0x0000000000000001 (NEEDED)             Shared library: [foo_wrapper.so]
0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]

ld does not resolve the symbol itself; this is the job of the dynamic linker. But it does a dry run and helps the dynamic linker by noting that foo_wrapper.so is needed in order to resolve the symbols, so it must be loaded before the search for the symbols starts. However, it does not explicitly say that the symbol source_func must be looked up in foo_wrapper.so - it could actually be found in, and used from, any loaded library.
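The NEEDED entries can also be pulled out of the `readelf -d` output programmatically. A small sketch with a hypothetical helper name; the `tag` parameter is there so the same function can be reused for other dynamic-section tags:

```python
import re


def dynamic_entries(readelf_output, tag):
    """Extract the bracketed values of a dynamic-section tag (for example
    "NEEDED") from `readelf -d` output."""
    pattern = re.compile(r"\(" + re.escape(tag) + r"\)\s+.*\[(.*)\]")
    matches = (pattern.search(line) for line in readelf_output.splitlines())
    return [m.group(1) for m in matches if m]
```

Applied to the output shown above, `dynamic_entries(output, "NEEDED")` would start with foo_wrapper.so, which is exactly the hint the dynamic linker receives.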

Let's start python again, this time without preloading:

  >>>> LD_DEBUG=libs,files,symbols python
  >>>> import test_lib
  ....
  >>>> file=./test_lib.so [0];  dynamically loaded by python [0]....
  >>>> file=foo_wrapper.so [0];  needed by ./test_lib.so [0]
  >>>> find library=foo_wrapper.so [0]; searching
  >>>> search cache=/etc/ld.so.cache
  .....
  >>>> foo_wrapper.so: cannot open shared object file: No such file or directory.

OK, now the dynamic linker knows that it has to find foo_wrapper.so, but the library is nowhere on its search path, so we get an error message.

We have to tell the dynamic linker where to look for the shared libraries. There are many ways to do this; one of them is to set LD_LIBRARY_PATH:

 LD_DEBUG=libs,symbols,files LD_LIBRARY_PATH=. python
 >>>> import test_lib
 ....
 >>>> find library=foo_wrapper.so [0]; searching
 >>>> search path=./tls/x86_64:./tls:./x86_64:.     (LD_LIBRARY_PATH) 
 >>>> ...
 >>>> trying file=./foo_wrapper.so
 >>>> file=foo_wrapper.so [0];  generating link map

This time foo_wrapper.so is found (the dynamic loader looked at the places hinted at by LD_LIBRARY_PATH), loaded and then used for resolving the undefined symbols in test_lib.so.

But what is the difference if the runtime_library_dirs setup argument is used?

 ext_modules = cythonize([
    Extension("test_lib", ["test_lib.pyx"], 
              libraries=[':foo_wrapper.so'], 
              library_dirs=['.'],               
              runtime_library_dirs=['.']
             )
])

and calling

 LD_DEBUG=libs,symbols,files python
 >>>> import test_lib
 ....
 >>>> file=foo_wrapper.so [0];  needed by ./test_lib.so [0]
 >>>> find library=foo_wrapper.so [0]; searching
 >>>> search path=./tls/x86_64:./tls:./x86_64:.     (RPATH from file ./test_lib.so)
 >>>>     trying file=./foo_wrapper.so
 >>>>   file=foo_wrapper.so [0];  generating link map

foo_wrapper.so is found via a so-called RPATH even though LD_LIBRARY_PATH is not set. We can see this RPATH being inserted by the static linker:

  >>>> readelf -d test_lib.so | grep RPATH
        0x000000000000000f (RPATH)              Library rpath: [.]

However, this path is relative to the current working directory, which is most of the time not what is wanted. One should pass an absolute path or use

   ext_modules = cythonize([
              Extension("test_lib", ["test_lib.pyx"], 
              libraries=[':foo_wrapper.so'],
              library_dirs=['.'],                   
              extra_link_args=["-Wl,-rpath=$ORIGIN/."] #rather than runtime_library_dirs
             )
])

to make the path relative to the current location of the resulting shared library (which can change, for example, through copying/moving). readelf now shows:

>>>> readelf -d test_lib.so | grep RPATH
     0x000000000000000f (RPATH)              Library rpath: [$ORIGIN/.]

which means the needed shared library will be searched for relative to the path of the loaded shared library, i.e. test_lib.so.

That is also how your setup should look if you would like to reuse the symbols from foo_wrapper.so, which I do not advocate.

There are however some possibilities to use the libraries you have already built.

Let's go back to the original setup. What happens if we first import foo_wrapper (as a kind of preload) and only then test_lib? I.e.:

 >>>> import foo_wrapper
 >>>> import test_lib

This doesn't work out of the box. But why? Obviously, the symbols loaded from foo_wrapper are not visible to other libraries. Python uses dlopen for dynamic loading of shared libraries, and as explained in this good article, there are several different strategies possible. We can use

 >>>> import sys
 >>>> sys.getdlopenflags() 
 >>>> 2

to see which flags are set. 2 means RTLD_NOW, which means that the symbols are resolved directly upon loading of the shared library. We need to OR the flags with RTLD_GLOBAL=256 to make the symbols visible globally/outside of the dynamically loaded library.
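Rather than memorizing the numeric values (which are platform-specific; 2 and 256 are the Linux values), the flag integer can be decoded using the constants from the os module. A minimal sketch for Unix-like systems; `describe_dlopen_flags` is a made-up helper name:

```python
import os


def describe_dlopen_flags(flags):
    """Name the dlopen flags contained in an integer such as the one
    returned by sys.getdlopenflags()."""
    names = []
    for name in ("RTLD_LAZY", "RTLD_NOW", "RTLD_GLOBAL"):
        value = getattr(os, name, 0)  # 0 if the platform lacks the constant
        if value and (flags & value) == value:
            names.append(name)
    return names
```

`describe_dlopen_flags(sys.getdlopenflags())` then shows at a glance whether RTLD_GLOBAL is part of the current setting.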

>>> import sys; import ctypes;
>>> sys.setdlopenflags(sys.getdlopenflags()| ctypes.RTLD_GLOBAL)
>>> import foo_wrapper
>>> import test_lib

And it works; our debug trace shows:

>>> symbol=source_func;  lookup in file=./foo_wrapper.so [0]
>>> file=./foo_wrapper.so [0];  needed by ./test_lib.so [0] (relocation dependency)
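Changing the dlopen flags affects every extension module imported afterwards, so it may be preferable to limit the change to the one import that needs it. A sketch of such a scoped variant (Unix only; the context-manager name `global_symbols` is hypothetical):

```python
import contextlib
import os
import sys


@contextlib.contextmanager
def global_symbols():
    """Temporarily add RTLD_GLOBAL to the flags Python passes to dlopen."""
    old_flags = sys.getdlopenflags()
    sys.setdlopenflags(old_flags | os.RTLD_GLOBAL)
    try:
        yield
    finally:
        sys.setdlopenflags(old_flags)  # restore the default for later imports


# Usage (assumes the two extension modules from the question are importable):
#   with global_symbols():
#       import foo_wrapper
#   import test_lib
```

Only foo_wrapper is loaded with globally visible symbols here; test_lib and everything after it are loaded with the previous flags.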

Another interesting detail: foo_wrapper.so is loaded only once, because Python does not load a module twice via import foo_wrapper. But even if it were opened twice, it would be in memory only once (the second open only increases the reference count of the shared library).

But now, with the insight we have gained, we can go even further:

 >>>> import sys;
 >>>> sys.setdlopenflags(1|256)#RTLD_LAZY+RTLD_GLOBAL
 >>>> import test_lib
 >>>> test_lib.do_it()
 >>>> ... it works! ....

Why is this? RTLD_LAZY means that the symbols are not resolved directly upon loading but when they are used for the first time. But before the first usage (test_lib.do_it()), foo_wrapper is loaded (it is imported inside the test_lib module), and due to RTLD_GLOBAL its symbols can be used for resolving later on.

If we don't use RTLD_GLOBAL, the failure comes only when we call test_lib.do_it(), because the needed symbols from foo_wrapper are not visible globally in this case.

As to the question of why it is not such a great idea to just link both modules foo_wrapper and test_lib against foo.c: singletons, see this.
