Resolving circular shared-object dependencies with ctypes/cffi
I would like to use cffi (or even ctypes if I must) to access a C ABI from Python 3 on Linux. The API is implemented by a number of .so files (let's call them libA.so, libB.so and libC.so), such that libA contains the main exported functions, and the other libs provide support for libA.
Now, libA depends on libB and libB depends on libC. However, there's a problem. There's a global array defined by libA that libC expects to be present. So libC actually depends on libA - a circular dependency. Trying to use the cffi or ctypes equivalent of dlopen to load libA results in missing symbols from libB and libC, but trying to load libC first results in an error about the missing array (which is in libA).
Since it's a variable, rather than a function, the RTLD_LAZY option doesn't seem to apply here.
Oddly, ldd libA.so doesn't show libB or libC as dependencies, so I'm not sure if that's part of the problem. I suppose that relies on any program that links with these libraries to explicitly specify them all.
Is there a way to get around this? One idea was to create a new shared object (say, "all.so") that is dependent on libA, libB and libC so that dlopen("all.so") might load everything it needs in one go, but I can't get this to work either.
What's the best strategy to handle this situation? In reality, the ABI I'm trying to access is pretty large, with perhaps 20-30 shared object files.
This (if I understood the problem correctly) is a perfectly normal use case on Nix, and should run without problems.
When dealing with problems related to ctypes ([Python 3]: ctypes - A foreign function library for Python), the best (generic) way to tackle them is:
- Write a (small) C application that does the required job (and of course, works)
- Only then move to ctypes (basically this is translating the above application)
I prepared a small (and dummy) example:
defines.h:
#pragma once
#include <stdio.h>

#define PRINT_MSG_0() printf("From C: [%s] (%d) - [%s]\n", __FILE__, __LINE__, __FUNCTION__)
libC:
libC.h:
#pragma once

size_t funcC();
libC.c:
#include "defines.h"
#include "libC.h"
#include "libA.h"


size_t funcC() {
    PRINT_MSG_0();
    for (size_t i = 0; i < ARRAY_DIM; i++) {
        printf("%zu - %c\n", i, charArray[i]);
    }
    printf("\n");
    return ARRAY_DIM;
}
libB:
libB.h:
#pragma once

size_t funcB();
libB.c:
#include "defines.h"
#include "libB.h"
#include "libC.h"


size_t funcB() {
    PRINT_MSG_0();
    return funcC();
}
libA:
libA.h:
#pragma once

#define ARRAY_DIM 3

extern char charArray[ARRAY_DIM];

size_t funcA();
libA.c:
#include "defines.h"
#include "libA.h"
#include "libB.h"


char charArray[ARRAY_DIM] = {'A', 'B', 'C'};

size_t funcA() {
    PRINT_MSG_0();
    return funcB();
}
code.py:
#!/usr/bin/env python3

import sys
from ctypes import CDLL, \
    c_size_t


DLL = "./libA.so"


def main():
    lib_a = CDLL(DLL)
    func_a = lib_a.funcA
    func_a.restype = c_size_t

    ret = func_a()
    print("{:s} returned {:d}".format(func_a.__name__, ret))


if __name__ == "__main__":
    print("Python {:s} on {:s}\n".format(sys.version, sys.platform))
    main()
Output:
[cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> ls
code.py defines.h libA.c libA.h libB.c libB.h libC.c libC.h
[cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> gcc -fPIC -shared -o libC.so libC.c
[cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> gcc -fPIC -shared -o libB.so libB.c -L. -lC
[cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> gcc -fPIC -shared -o libA.so libA.c -L. -lB
[cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> ls
code.py defines.h libA.c libA.h libA.so libB.c libB.h libB.so libC.c libC.h libC.so
[cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> LD_LIBRARY_PATH=. ldd libC.so
    linux-vdso.so.1 => (0x00007ffdfb1f4000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f56dcf23000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f56dd4ef000)
[cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> LD_LIBRARY_PATH=. ldd libB.so
    linux-vdso.so.1 => (0x00007ffc2e7fd000)
    libC.so => ./libC.so (0x00007fdc90a9a000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fdc906d0000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fdc90e9e000)
[cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> LD_LIBRARY_PATH=. ldd libA.so
    linux-vdso.so.1 => (0x00007ffd20d53000)
    libB.so => ./libB.so (0x00007fdbee95a000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fdbee590000)
    libC.so => ./libC.so (0x00007fdbee38e000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fdbeed5e000)
[cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> nm -S libC.so | grep charArray
                 U charArray
[cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> nm -S libA.so | grep charArray
0000000000201030 0000000000000003 D charArray
[cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> LD_LIBRARY_PATH=. python3 code.py
Python 3.5.2 (default, Nov 12 2018, 13:43:14)
[GCC 5.4.0 20160609] on linux

From C: [libA.c] (9) - [funcA]
From C: [libB.c] (7) - [funcB]
From C: [libC.c] (7) - [funcC]
0 - A
1 - B
2 - C

funcA returned 3
But if your array is declared as static ([CPPReference]: C keywords: static), it has internal linkage and can't be extern as in the example. In that case no other library can see it, and you're out of luck.
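One way to check from Python whether a symbol such as the array is actually exported (that is, not static) is to look it up with ctypes' in_dll. This is only a minimal sketch: the helper name exports_symbol is hypothetical, and the demo looks up strlen in the already-running process (CDLL(None), which assumes Linux) instead of charArray in libA.so, so that it runs without the example libraries being built.

```python
import ctypes


def exports_symbol(lib, name):
    """Return True if `name` can be resolved through the library handle `lib`."""
    try:
        # in_dll() only needs the symbol's address, so c_void_p is enough here.
        ctypes.c_void_p.in_dll(lib, name)
        return True
    except ValueError:
        # ctypes raises ValueError when the symbol lookup fails.
        return False


if __name__ == "__main__":
    # CDLL(None) refers to the running process itself (assumption: Linux).
    process = ctypes.CDLL(None)
    print(exports_symbol(process, "strlen"))
    print(exports_symbol(process, "no_such_symbol_q053327620"))
```

With the example libraries built, the same helper could be pointed at CDLL("./libA.so") and "charArray"; if the array were static, the lookup would be expected to fail.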
@EDIT0: Extending the example so that it better fits the description.
Since ldd doesn't show dependencies between the .sos, I'm going to assume that each is loaded dynamically.
utils.h:
#pragma once
#include <dlfcn.h>

void *loadLib(char id);
utils.c:
#include "defines.h"
#include "utils.h"


void *loadLib(char id) {
    PRINT_MSG_0();
    char libNameFormat[] = "lib%c.so";
    char libName[8];
    sprintf(libName, libNameFormat, id);
    // !!! @TODO - @CristiFati: Note RTLD_LAZY: if RTLD_NOW would be here instead, there would be nothing left to do. Same thing if RTLD_GLOBAL wouldn't be specified. !!!
    int load_flags = RTLD_LAZY | RTLD_GLOBAL;
    void *ret = dlopen(libName, load_flags);
    if (ret == NULL) {
        char *err = dlerror();
        printf("Error loading lib (%s): %s\n", libName, (err != NULL) ? err : "(null)");
    }
    return ret;
}
Below is a modified version of libB.c. Note that the same pattern should also be applied to the original libA.c.
libB.c:
#include "defines.h"
#include "libB.h"
#include "libC.h"
#include "utils.h"


size_t funcB() {
    PRINT_MSG_0();
    void *mod = loadLib('C');
    size_t ret = funcC();
    dlclose(mod);
    return ret;
}
Output:
[cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> ls
code.py defines.h libA.c libA.h libB.c libB.h libC.c libC.h utils.c utils.h
[cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> gcc -fPIC -shared -o libC.so libC.c utils.c
[cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> gcc -fPIC -shared -o libB.so libB.c utils.c
[cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> gcc -fPIC -shared -o libA.so libA.c utils.c
[cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> ls
code.py defines.h libA.c libA.h libA.so libB.c libB.h libB.so libC.c libC.h libC.so utils.c utils.h
[cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> ldd libA.so
    linux-vdso.so.1 => (0x00007ffe5748c000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f4d9e3f6000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f4d9e9c2000)
[cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> ldd libB.so
    linux-vdso.so.1 => (0x00007ffe22fe3000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe93ce8a000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fe93d456000)
[cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> ldd libC.so
    linux-vdso.so.1 => (0x00007fffe85c3000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f2d47453000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f2d47a1f000)
[cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> nm -S libC.so | grep charArray
                 U charArray
[cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> nm -S libA.so | grep charArray
0000000000201060 0000000000000003 D charArray
[cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> LD_LIBRARY_PATH=. python3 code.py
Python 3.5.2 (default, Nov 12 2018, 13:43:14)
[GCC 5.4.0 20160609] on linux

Traceback (most recent call last):
  File "code.py", line 22, in <module>
    main()
  File "code.py", line 12, in main
    lib_a = CDLL(DLL)
  File "/usr/lib/python3.5/ctypes/__init__.py", line 347, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: ./libA.so: undefined symbol: funcB
I believe that this reproduces the problem. Now, if you modify (the 1st part of) code.py to:
#!/usr/bin/env python3

import sys
from ctypes import CDLL, \
    RTLD_GLOBAL, \
    c_size_t


RTLD_LAZY = 0x0001

DLL = "./libA.so"


def main():
    lib_a = CDLL(DLL, RTLD_LAZY | RTLD_GLOBAL)
    func_a = lib_a.funcA
    func_a.restype = c_size_t

    ret = func_a()
    print("{:s} returned {:d}".format(func_a.__name__, ret))


if __name__ == "__main__":
    print("Python {:s} on {:s}\n".format(sys.version, sys.platform))
    main()
you'd get the following output:
[cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> LD_LIBRARY_PATH=. python3 code.py
Python 3.5.2 (default, Nov 12 2018, 13:43:14)
[GCC 5.4.0 20160609] on linux

From C: [libA.c] (11) - [funcA]
From C: [utils.c] (6) - [loadLib]
From C: [libB.c] (8) - [funcB]
From C: [utils.c] (6) - [loadLib]
From C: [libC.c] (7) - [funcC]
0 - A
1 - B
2 - C

funcA returned 3
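To see the difference between the two runs side by side, the load can be wrapped so that a failure is reported instead of raised. The following is only a sketch: try_load is a hypothetical helper, RTLD_LAZY reuses the hard-coded value from code.py (an assumption that holds for Linux/glibc), and "./libA.so" is the example library from above.

```python
from ctypes import CDLL, DEFAULT_MODE, RTLD_GLOBAL

RTLD_LAZY = 0x0001  # same hard-coded value as in code.py (assumption: Linux)


def try_load(path, mode=DEFAULT_MODE):
    """Try to load `path`; return (handle, None) or (None, error text)."""
    try:
        return CDLL(path, mode), None
    except OSError as exc:
        # The message carries the loader's complaint, e.g. an undefined symbol.
        return None, str(exc)


if __name__ == "__main__":
    for label, mode in (
        ("default", DEFAULT_MODE),
        ("lazy | global", RTLD_LAZY | RTLD_GLOBAL),
    ):
        lib, err = try_load("./libA.so", mode)
        print("{:s}: {:s}".format(label, "loaded" if lib is not None else err))
```

Run next to the libraries built in the second variant, the "default" line would be expected to show the undefined symbol error from the traceback, while the "lazy | global" line should report a successful load.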
Notes:
- It's very important that RTLD_LAZY | RTLD_GLOBAL are both passed to dlopen in the C code. If RTLD_LAZY is replaced by RTLD_NOW, it won't work
- Also, if RTLD_GLOBAL isn't specified, it won't work either. I didn't check whether there are other RTLD_ flags that could be specified instead of RTLD_GLOBAL for things to still work
- Creating a wrapper library that deals with loading and initializing all the libraries would be a good thing (workaround), especially if you plan to use them from multiple places (that way, the whole process would happen in one place only). But the previous bullets would still apply
- For some reason, ctypes doesn't expose RTLD_LAZY (or many other related flags, as a matter of fact). Defining it in code.py is kind of a workaround, and on different (Nix) platforms (flavors) its value might differ
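As an alternative to hard-coding the flag value, Python's os module exposes the dlopen flags on Unix (os.RTLD_LAZY, os.RTLD_GLOBAL and friends, Python 3.3+). The sketch below uses them to load several libraries in a fixed order, which may be handy with 20-30 shared objects. Note that the dlopen man page states lazy binding covers function references only, while variable references are bound at load time. The assumed order therefore puts the library that defines the global array (libA) first. load_in_order is a hypothetical helper, and this has not been tried against the real libraries from the question.

```python
import ctypes
import os

# Flag values come from the platform via the os module (Unix, Python 3.3+).
LOAD_MODE = os.RTLD_LAZY | os.RTLD_GLOBAL


def load_in_order(paths, mode=LOAD_MODE):
    """Load each path in the given order and return {path: CDLL handle}."""
    handles = {}
    for path in paths:
        # RTLD_GLOBAL makes this library's symbols visible to later loads.
        handles[path] = ctypes.CDLL(path, mode)
    return handles


if __name__ == "__main__":
    # Assumed order: the library defining the global array goes first.
    names = ["./libA.so", "./libB.so", "./libC.so"]
    if all(os.path.exists(name) for name in names):
        libs = load_in_order(names)
        func_a = libs["./libA.so"].funcA
        func_a.restype = ctypes.c_size_t
        print("funcA returned {:d}".format(func_a()))
    else:
        print("example libraries not built; nothing to load")
```

Keeping the list of paths in one place plays the same role as the wrapper library mentioned above, only on the Python side.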