Ld magically overrides statically linked symbols


Problem description

For a few days we have been dealing with a very strange problem.

I can't understand how it even happens: when a third-party (MATLAB) program uses our shared library, it somehow overrides some of our symbols (Boost symbols, to be precise) with its own. Those symbols are statically linked and (!!) local.

Here is the deal: we use Boost 1.47, MATLAB has Boost 1.40. Currently, a library call segfaults on a call from OUR library into their Boost (regex).

So, here is the magic:


  • We have no library dependencies on Boost, according to ldd:


    linux-vdso.so.1 =>  (0x00007fff4abff000)
    libpthread.so.0 => /lib/libpthread.so.0 (0x00007f1a3fd65000)
    libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007f1a3fa51000)
    libm.so.6 => /lib/libm.so.6 (0x00007f1a3f7cd000)
    libgomp.so.1 => /usr/lib/libgomp.so.1 (0x00007f1a3f5bf000)
    libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00007f1a3f3a8000)
    libc.so.6 => /lib/libc.so.6 (0x00007f1a3f024000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f1a414f9000)
    librt.so.1 => /lib/librt.so.1 (0x00007f1a3ee1c000)




  • No C++ symbols are exported from our library (our public symbols are plain old C, for binary compatibility), according to nm:

    nm -g --defined-only libmysharedlib.so
    
    addr1 T OurCSymbol1
    addr2 T OurCSymbol2
    addr3 T OurCSymbol3
    ...
    




    • So, how? Stacktrace (paths cut):

      [  0] 0x00007f21fddbb0a9 bin/libmwfl.so+00454825 fl::sysdep::linux::unwind_stack(void const**, unsigned long, unsigned long, fl::diag::thread_context const&)+000009
      [  1] 0x00007f21fdd74111 bin/glnxa64/libmwfl.so+00164113 fl::diag::stacktrace_base::capture(fl::diag::thread_context const&, unsigned long)+000161
      [  2] 0x00007f21fdd7d42d bin/glnxa64/libmwfl.so+00201773
      [  3] 0x00007f21fdd7d6b4 bin/glnxa64/libmwfl.so+00202420 fl::diag::terminate_log(char const*, fl::diag::thread_context const&, bool)+000100
      [  4] 0x00007f21fce525a7 bin/glnxa64/libmwmcr.so+00365991
      [  5] 0x00007f21fb9eb8f0 lib/libpthread.so.0+00063728
      [  6] 0x00007f21f3e939a9 libboost_regex.so.1.40.0+00342441 boost::re_detail::perl_matcher, std::allocator > >, boost::regex_traits > >::match_all_states()+000073
      [  7] 0x00007f21f3eb6546 bin/glnxa64/libboost_regex.so.1.40.0+00484678 boost::re_detail::perl_matcher, std::allocator > >, boost::regex_traits > >::match_imp()+000758
      [  8] 0x00007f21c04ad595 lib/libmysharedlib.so+04855189 bool boost::regex_match, std::allocator > >, char, boost::regex_traits > >(__gnu_cxx::__normal_iterator, __gnu_cxx::__normal_iterator, boost::match_results, std::allocator > > >&, boost::basic_regex > > const&, boost::regex_constants::_match_flags)+000245
      [  9] 0x00007f21c04a71c7 lib/libmysharedlib.so+04829639 myfunc2()+000183
      [ 10] 0x00007f21c01b41e3 lib/libmysharedlib.so+01737187 myfunc1()+000307
      

      It is known that MATLAB calls dlopen with the RTLD_NOW flag only.
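For readers who want to reproduce the loading step, here is a minimal sketch of how a host program might load a library with RTLD_NOW only, as the question says MATLAB does. The function name resolve_symbol and both of its parameters are hypothetical, and nothing here is taken from MATLAB itself. With no other flags, references inside the loaded library that go through its dynamic symbol table are looked up in the already-loaded global scope before the library's own definitions. On older glibc versions the program has to be linked with -ldl.

```c
#include <dlfcn.h>
#include <stdio.h>

/*
 * Hypothetical helper: load lib_path with RTLD_NOW only and look up one
 * exported symbol. Returns the symbol's address, or NULL on failure.
 */
void *resolve_symbol(const char *lib_path, const char *symbol_name)
{
    /* RTLD_NOW: every undefined reference in the library is bound during dlopen. */
    void *handle = dlopen(lib_path, RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return NULL;
    }

    dlerror(); /* clear any stale error state before calling dlsym */
    void *address = dlsym(handle, symbol_name);
    const char *error = dlerror();
    if (error != NULL) {
        fprintf(stderr, "dlsym failed: %s\n", error);
        dlclose(handle);
        return NULL;
    }

    /* The handle is deliberately left open so the returned address stays valid. */
    return address;
}
```

A host would call something like resolve_symbol("lib/libmysharedlib.so", "OurCSymbol1") and cast the result to the matching function pointer type.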

      People, please think with me. At this point I am desperate not even to fix this, but simply to understand ld and ELF behavior.

      Edit: a small additional question. As I understand it, without special linker options, symbols in Linux .so libraries are never linked by address? So even statically linked local symbols are resolved at runtime?

      Recommended answer

      Check out the -Bsymbolic option for ld.

      If -Bsymbolic is specified, then at the time of creating a shared object ld will attempt to bind references to global symbols to definitions within the shared library. The default is to defer binding to runtime.

      This may be clearer with an example.

      example.o contains a reference to globalfn, which is defined in global.o:

      $ nm example.o | grep ' U'
           U _GLOBAL_OFFSET_TABLE_
           U globalfn
      $ nm global.o | grep ' T'
      00000000 T globalfn
      

      and two shared objects, normal.so and symbolic.so, are built as follows:

      $ cc -fPIC -c example.c
      $ cc -c global.c
      $ rm -f archive.a; ar cr archive.a global.o
      $ ld -shared -o normal.so example.o archive.a
      $ ld -Bsymbolic -shared -o symbolic.so example.o archive.a
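The answer does not show example.c or global.c. A minimal sketch of what they might contain follows, merged into one listing so that it compiles on its own. The function bodies and return values are assumptions made purely for illustration. Only the names globalfn and example come from the nm and objdump output.

```c
/* global.c (hypothetical contents): the definition that is archived into archive.a. */
int globalfn(void)
{
    return 42; /* arbitrary value, chosen only for illustration */
}

/* example.c (hypothetical contents): carries the undefined reference to globalfn. */
int globalfn(void); /* declaration only; nm reports globalfn as "U" in example.o */

int example(void)
{
    /* This call is the one whose binding differs between normal.so and symbolic.so. */
    return globalfn() + 1;
}
```

To follow the build commands above, split the listing at the second file comment into global.c and example.c.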
      

      Disassembling the code for normal.so shows that the call to globalfn actually goes through the procedure linkage table, and thus the final destination of the call is determined at runtime.

      $ objdump --disassemble normal.so
      ...snip...
      00000194 <example>:
      ...snip...
       1a6:   e8 d9 ff ff ff          call   184 <globalfn@plt>
      ...snip...
      $ readelf -r normal.so
      
      Relocation section '.rel.plt' at offset 0x16c contains 1 entries:
      Offset     Info    Type            Sym.Value  Sym. Name
      00001244  00000207 R_386_JUMP_SLOT   000001b8   globalfn
      

      Whereas in symbolic.so, the call always invokes the definition of globalfn within the shared object.

      $ objdump --disassemble symbolic.so
      ...snip...
      0000016c <shared>:
      ...snip...
       17e:   e8 0d 00 00 00          call   190 <globalfn>
      ...snip...
      $ readelf -r symbolic.so
      
      There are no relocations in this file.
      
