Can I make shared library constructors execute before relocations?

Problem description

Background: I'm trying to implement a system like that described in this previous answer. In short, I have an application that links against a shared library (on Linux at present). I would like that shared library to switch between multiple implementations at runtime (for instance, based on whether the host CPU supports a certain instruction set).

In its simplest case, I have three distinct shared library files:

  • libtest.so: This is the "vanilla" version of the library that will be used as a fallback case.
  • libtest_variant.so: This is the "optimized" variant of the library that I would like to select at runtime if the CPU supports it. It is ABI-compatible with libtest.so.
  • libtest_dispatch.so: This is the library that is responsible for choosing which variant of the library to use at runtime.

In keeping with the approach suggested in the linked answer above, I'm doing the following:

  • The final application is linked against libtest.so.
  • I have the DT_SONAME field of libtest.so set to libtest_dispatch.so. Therefore, when I run the application, it will load libtest_dispatch.so instead of the actual dependency libtest.so.
  • libtest_dispatch.so is configured to have a constructor function that looks like this (pseudocode):

    __attribute__((constructor)) void init()
    {
        if (can_use_variant) dlopen("libtest_variant" SHLIB_EXT, RTLD_NOW | RTLD_GLOBAL);
        else dlopen("libtest" SHLIB_EXT, RTLD_NOW | RTLD_GLOBAL);
    }
    

    The call to dlopen() will load the shared library that provides the appropriate implementation, and the application moves on.

Result: This works! If I place an identically-named function in each shared library, I can verify at runtime that the appropriate version is executed based upon the conditions used by the dispatch library.

The problem: The above works for the toy example in the linked question. Specifically, it seems to work fine if the libraries export only functions. However, once variables are in play (whether global variables with C linkage or C++ constructs like typeinfo), I get unresolved-symbol errors at runtime.

The below code demonstrates the problem:

libtest.h:

extern int bar;

int foo();

libtest.cc:

#include <iostream>

int bar = 2;

int foo()
{
    std::cout << "function call came from libtest" << std::endl;
    return 0;
}

libtest_variant.cc:

#include <iostream>

int bar = 1;

int foo()
{
    std::cout << "function call came from libtest_variant" << std::endl;
    return 0;
}

libtest_dispatch.cc:

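#include <dlfcn.h>
#include <iostream>
#include <stdlib.h>

// SHLIB_EXT is not defined anywhere in the post; ".so" is assumed here for Linux.
#ifndef SHLIB_EXT
#define SHLIB_EXT ".so"
#endif

// Runs when the dispatch library is loaded and picks the implementation to dlopen().
__attribute__((constructor)) void init()
{
    if (getenv("USE_VARIANT")) dlopen("libtest_variant" SHLIB_EXT, RTLD_NOW | RTLD_GLOBAL);
    else dlopen("libtest" SHLIB_EXT, RTLD_NOW | RTLD_GLOBAL);
}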

test.cc:

#include "libtest.h"
#include <iostream>

int main()
{
    std::cout << "bar: " << bar << std::endl;
    foo();
}

I build the libraries and test application using the following:

g++ -fPIC -shared -o libtest.so libtest.cc -Wl,-soname,libtest_dispatch.so
g++ -fPIC -shared -o libtest_variant.so libtest_variant.cc
g++ -fPIC -shared -o libtest_dispatch.so libtest_dispatch.cc -ldl
g++ test.cc -o test -L. -ltest -Wl,-rpath,.

Then, I try to run the test using the following command lines:

> ./test
./test: symbol lookup error: ./test: undefined symbol: bar
> USE_VARIANT=1 ./test
./test: symbol lookup error: ./test: undefined symbol: bar

Failure. If I remove all instances of the global variable bar and try to dispatch the foo() function only, then it all works. I'm trying to figure out exactly why and whether I can get the effect that I want in the presence of global variables.

Debugging: In attempting to diagnose the problem, I've done some playing with the LD_DEBUG environment variable while running the test program. It seems like the problem comes down to this:

The dynamic linker performs relocations of global variables from shared libraries very early in the loading process, before constructors from shared libraries are called. Therefore, it tries to locate some global variable symbols before my dispatch library has had a chance to run its constructor and load the library that will actually provide those symbols.

This seems to be a big roadblock. Is there some way that I can alter this process so that my dispatcher can run first?

I know that I could preload the library using LD_PRELOAD. However, this is a cumbersome requirement for the environment that my software will eventually run in. I'd like to find a different solution if possible.

Upon further review, it appears that even if I LD_PRELOAD the library, I have the same problem. The constructor still doesn't get executed before the global variable symbol resolution occurs. Usage of the preload feature just pushes the desired library to the top of the library list.

Solution

"Failure. If I remove all instances of the global variable bar and try to dispatch the foo() function only, then it all works."

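The reason this works without global variables is that functions (by default) use lazy binding, but variables cannot: a data reference has no PLT stub through which resolution could be deferred until first use, so it has to be resolved when the program is loaded.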

You would get the exact same failure without any global variables if your test program is linked with -Wl,-z,now (which would disable lazy binding of functions).

You could fix this by introducing an instance of every global variable referenced by your main program into the dispatch library.

Contrary to what your other answer suggests, this is not the standard way to do CPU-specific dispatch.

There are two standard ways.

The older one: use $PLATFORM as part of DT_RPATH or DT_RUNPATH. The kernel will pass in a string, such as x86_64, or i386, or i686 as part of the aux vector, and ld.so will replace $PLATFORM with that string.

This allowed distributions to ship both i386 and i686-optimized libraries, and have a program select appropriate version depending on which CPU it was running on.

Needless to say, this isn't very flexible, and (as far as I understand) doesn't allow you to distinguish between various x86_64 variants.
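To check which string the kernel supplies on a given machine, the aux vector entry can be read directly. This is a minimal sketch, assuming Linux and a libc that provides getauxval() in <sys/auxv.h>; the helper name platform_string is made up for illustration and is not part of the original answer.

```cpp
#include <string>
#if defined(__linux__)
#include <sys/auxv.h>
#endif

// Hypothetical helper: returns the AT_PLATFORM string from the auxiliary
// vector, or "unknown" when the entry is missing or the OS is not Linux.
std::string platform_string()
{
#if defined(__linux__)
    // getauxval() returns 0 when the requested entry is not present.
    unsigned long value = getauxval(AT_PLATFORM);
    if (value != 0) {
        return reinterpret_cast<const char*>(value);
    }
#endif
    return "unknown";
}
```

Printing platform_string() from a small main() should show the name that ld.so would substitute for $PLATFORM on that host.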

The new hotness is IFUNC dispatch, documented here. This is what GLIBC currently uses to provide different versions of e.g. memcpy depending on which CPU it is running on. There are also the target and target_clones attributes (documented on the same page), which allow you to compile several variants of a routine, optimized for different processors (in case you don't want to code them in assembly).
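As an illustration of the target_clones attribute, the sketch below asks the compiler for an AVX2 clone and a default clone of one routine, with the choice between them made at load time. It assumes GCC (or a recent Clang) on x86-64 with glibc, where the attribute is built on IFUNC; on other platforms the macro expands to nothing and a single ordinary version is compiled. The function sum_ints and the macro MULTIVERSIONED are hypothetical names, not from the original post.

```cpp
#include <cstddef>

// Request function multiversioning only where it is expected to be
// available (assumption: x86-64, GNU-compatible compiler, glibc).
#if defined(__x86_64__) && defined(__GNUC__) && defined(__GLIBC__)
#define MULTIVERSIONED __attribute__((target_clones("avx2", "default")))
#else
#define MULTIVERSIONED
#endif

// Hypothetical routine: the compiler emits one clone per listed target
// plus a resolver that picks a clone based on the running CPU.
MULTIVERSIONED
long sum_ints(const int* data, std::size_t count)
{
    long total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += data[i];
    }
    return total;
}
```

Callers simply call sum_ints(); no dispatch library or dlopen() is involved, and because only functions are dispatched, global variables are unaffected.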

"I'm trying to apply this functionality to an existing, very large library, so just a recompile is the most straightforward way of implementing it."

In that case, you may have to wrap the binary in a shell script, and set LD_LIBRARY_PATH to different directories depending on the CPU. Or have the user source your script before running the program.
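The answer suggests a shell script; the same idea can also be sketched as a small C++ launcher, which is a substitution made here for illustration and not something the answer proposes. It assumes a POSIX system, and the directory names and the function names library_dir_for, cpu_has_avx2 and launch are all hypothetical.

```cpp
#include <cstdlib>
#include <string>
#include <unistd.h>

// Hypothetical directory layout: one directory per library build.
std::string library_dir_for(bool has_avx2)
{
    return has_avx2 ? "./lib/avx2" : "./lib/generic";
}

// Reports AVX2 support on x86 with a GNU-compatible compiler; returns
// false everywhere else so the generic libraries are used.
bool cpu_has_avx2()
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") != 0;
#else
    return false;
#endif
}

// Sets LD_LIBRARY_PATH (overwriting any existing value) and replaces the
// current process with the real binary. Returns only on failure.
int launch(const char* real_binary, char* const argv[])
{
    const std::string dir = library_dir_for(cpu_has_avx2());
    if (setenv("LD_LIBRARY_PATH", dir.c_str(), 1) != 0) {
        return 1;
    }
    execv(real_binary, argv);
    return 127;  // reached only if execv() failed
}
```

A main() for the launcher would then just return launch() with the path of the real executable and its own argv.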

"target_clones does look interesting; is that a recent addition to GCC?"

I believe the IFUNC support is about 4-5 years old, and the automatic cloning in GCC is about 2 years old. So yes, quite recent.
