Ld magically overrides statically linked symbols
Problem description
For a few days now we have been dealing with a very strange problem.
I can't understand how it even happens: when a third-party program (MATLAB) uses our shared library, it somehow overrides some of our symbols (Boost symbols, to be precise) with its own. Those symbols are statically linked and (!!) local.
Here is the deal: we use Boost 1.47, and MATLAB ships Boost 1.40. Currently the library call segfaults on a call from OUR library into their Boost (regex).
So here is the magic:
- We have no library dependencies on Boost, per ldd:
linux-vdso.so.1 => (0x00007fff4abff000)
libpthread.so.0 => /lib/libpthread.so.0 (0x00007f1a3fd65000)
libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007f1a3fa51000)
libm.so.6 => /lib/libm.so.6 (0x00007f1a3f7cd000)
libgomp.so.1 => /usr/lib/libgomp.so.1 (0x00007f1a3f5bf000)
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00007f1a3f3a8000)
libc.so.6 => /lib/libc.so.6 (0x00007f1a3f024000)
/lib64/ld-linux-x86-64.so.2 (0x00007f1a414f9000)
librt.so.1 => /lib/librt.so.1 (0x00007f1a3ee1c000)
- No C++ symbols are exported from our library (our public symbols are plain C, for binary compatibility), per nm:
nm -g --defined-only libmysharedlib.so
addr1 T OurCSymbol1
addr2 T OurCSymbol2
addr3 T OurCSymbol3
...
- How, then? Stack trace (paths trimmed):
[ 0] 0x00007f21fddbb0a9 bin/libmwfl.so+00454825 fl::sysdep::linux::unwind_stack(void const**, unsigned long, unsigned long, fl::diag::thread_context const&)+000009
[ 1] 0x00007f21fdd74111 bin/glnxa64/libmwfl.so+00164113 fl::diag::stacktrace_base::capture(fl::diag::thread_context const&, unsigned long)+000161
[ 2] 0x00007f21fdd7d42d bin/glnxa64/libmwfl.so+00201773
[ 3] 0x00007f21fdd7d6b4 bin/glnxa64/libmwfl.so+00202420 fl::diag::terminate_log(char const*, fl::diag::thread_context const&, bool)+000100
[ 4] 0x00007f21fce525a7 bin/glnxa64/libmwmcr.so+00365991
[ 5] 0x00007f21fb9eb8f0 lib/libpthread.so.0+00063728
[ 6] 0x00007f21f3e939a9 libboost_regex.so.1.40.0+00342441 boost::re_detail::perl_matcher, std::allocator > >, boost::regex_traits > >::match_all_states()+000073
[ 7] 0x00007f21f3eb6546 bin/glnxa64/libboost_regex.so.1.40.0+00484678 boost::re_detail::perl_matcher, std::allocator > >, boost::regex_traits > >::match_imp()+000758
[ 8] 0x00007f21c04ad595 lib/libmysharedlib.so+04855189 bool boost::regex_match, std::allocator > >, char, boost::regex_traits > >(__gnu_cxx::__normal_iterator, __gnu_cxx::__normal_iterator, boost::match_results, std::allocator > > >&, boost::basic_regex > > const&, boost::regex_constants::_match_flags)+000245
[ 9] 0x00007f21c04a71c7 lib/libmysharedlib.so+04829639 myfunc2()+000183
[ 10] 0x00007f21c01b41e3 lib/libmysharedlib.so+01737187 myfunc1()+000307
MATLAB is known to call dlopen with the RTLD_NOW flag only.
People, please think this through with me. At this point I'm not even desperate to fix it, just to understand the ld and ELF behavior.
Edit: a small additional question. As I understand it, without special linker options, symbols in Linux .so libraries are never bound by address? So even statically linked local symbols are resolved at runtime?
Recommended answer
Check out the -Bsymbolic option for ld.
If -Bsymbolic is specified, then when creating a shared object ld will attempt to bind references to global symbols to definitions within the shared library. The default is to defer binding to runtime.
This may be clearer with an example.
Say example.o contains a reference to globalfn, which is defined in global.o:
$ nm example.o | grep ' U'
U _GLOBAL_OFFSET_TABLE_
U globalfn
$ nm global.o | grep ' T'
00000000 T globalfn
and two shared objects, normal.so and symbolic.so, are built as follows:
$ cc -fPIC -c example.c
$ cc -c global.c
$ rm -f archive.a; ar cr archive.a global.o
$ ld -shared -o normal.so example.o archive.a
$ ld -Bsymbolic -shared -o symbolic.so example.o archive.a
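The sources themselves are not shown here. A minimal sketch of what they might contain, assuming example() does nothing more than call globalfn(): only the names globalfn, example.c and global.c come from the commands above, while the function bodies, the return value and the name example() are hypothetical.

```c
/* global.c (hypothetical contents): defines the global symbol globalfn. */
int globalfn(void)
{
    return 42; /* arbitrary value, chosen only for illustration */
}

/* example.c (hypothetical contents): refers to globalfn without defining it,
 * which is why nm reports "U globalfn" for example.o. */
int globalfn(void);

int example(void)
{
    /* In the shared object, this call is what ends up either going through
     * the PLT (default) or bound directly to the local definition
     * (when linking with -Bsymbolic). */
    return globalfn();
}
```

To reproduce the experiment, the two halves would be split into global.c and example.c and built with the commands above.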
Disassembling the code for normal.so shows that the call to globalfn actually goes through the procedure linkage table (PLT), so the final destination of the call is determined at runtime.
$ objdump --disassemble normal.so
...snip...
00000194 <example>:
...snip...
1a6: e8 d9 ff ff ff call 184 <globalfn@plt>
...snip...
$ readelf -r normal.so
Relocation section '.rel.plt' at offset 0x16c contains 1 entries:
Offset Info Type Sym.Value Sym. Name
00001244 00000207 R_386_JUMP_SLOT 000001b8 globalfn
Whereas in symbolic.so, the call always invokes the definition of globalfn within the shared object.
$ objdump --disassemble symbolic.so
...snip...
0000016c <shared>:
...snip...
17e: e8 0d 00 00 00 call 190 <globalfn>
...snip...
$ readelf -r symbolic.so
There are no relocations in this file.
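On the question's edit: as far as I know, only references to symbols that are still global (default visibility) in the shared object are deferred to runtime. Functions declared static, or given hidden visibility, are bound when the library is linked and cannot be replaced by another library's definition. A minimal sketch using GCC's visibility attribute, assuming a GCC-compatible compiler; OurCSymbol1 is borrowed from the nm listing in the question, the helper names and bodies are hypothetical.

```c
/* Hypothetical library source, e.g. built with:
 *   cc -fPIC -shared lib.c -o libmysharedlib.so */

/* Hidden visibility: usable from other files of the same library, but not
 * placed in the dynamic symbol table, so calls to it are bound at link time. */
__attribute__((visibility("hidden")))
int internal_helper(int x)
{
    return x * 2;
}

/* static: local to this translation unit, never visible to the dynamic linker. */
static int file_local_helper(int x)
{
    return x + 1;
}

/* Default visibility: exported. References to this symbol are resolved by
 * the dynamic linker unless the library is linked with -Bsymbolic. */
__attribute__((visibility("default")))
int OurCSymbol1(int x)
{
    return internal_helper(file_local_helper(x));
}
```

Running nm -D on a library built this way should list OurCSymbol1 but neither helper, which is a quick way to check which symbols the dynamic linker can still see.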