gcc ld: method to determine link order of static libraries


Question

My executables are linked with many static libraries, typically between 50 and 100 archives on Linux. Occasionally there are dependency cycles among these archives. The order in which these libraries appear on the link command line is significant, see here. Ordering this many libraries by hand is time-consuming at minimum, especially when cycles are present.

Question: is there a utility or technique that will analyze a code base and produce a correct link command line ordering?

Recommended answer

You want a topological sort.

The tsort program will do that, but you'll need to do more work to use it [be prepared to write a perl/python script]. There is also another way. I'll get to the "howto" below, as I've done this sort of thing before.

Short answer: use -start-group liblist -end-group and be done.

For a few reasons:

An ld group is smart. It doesn't just loop over the files. It makes an initial pass through the group, but remembers the symbols. On subsequent passes it uses the cached symbol table information, so it's very fast.

For complex interactions, you may not be able to get rid of all the cycles with a toposort, so you'll still need a group even if liblist has been topo sorted.
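As a rough illustration of the group approach, here is a small Python sketch that assembles a link command with the whole library list wrapped in one group. The function name, the file names, and the choice of gcc as the driver (hence the -Wl, prefix that passes the options through to ld) are assumptions made for the example, not part of the original answer.

```python
def grouped_link_command(output, objects, libs, compiler="gcc"):
    """Return an argv list that links `objects` against `libs` in one ld group.

    The names here are hypothetical; only the --start-group/--end-group
    options come from ld itself.
    """
    return [
        compiler, "-o", output, *objects,
        # ld rescans the archives between these two options until no new
        # undefined symbols can be resolved, so their relative order is free.
        "-Wl,--start-group", *libs, "-Wl,--end-group",
    ]


cmd = grouped_link_command("app", ["main.o"], ["libfoo.a", "libbar.a"])
print(" ".join(cmd))
```

The argv list can be handed to subprocess.run as-is, which avoids shell quoting problems when the library list is long.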

Just how much time are we talking about? How much time do you think will be saved? And how will you measure things to prove you really need this?

Go for the gold

Instead of using ld, consider using ld.gold. It was written from scratch to not use libbfd [which is slow] and operates on ELF files directly. The primary motivation for creating it was simplicity and speed.

How to topologically sort a library list

If we do info coreutils, the tsort section gives an example of how to toposort a symbol table.

But before we can get to that, we need to get the symbols. For a .a file, nm can provide the list: nm -go <liblist>.

The output will look like this:

libbfd.a:
libbfd.a:archive.o:0000000000000790 T _bfd_add_bfd_to_archive_cache
libbfd.a:archive.o:                 U bfd_alloc
libbfd.a:archive.o:0000000000000c20 T _bfd_append_relative_path
libbfd.a:archive.o:                 U bfd_assert
libbfd.a:archive.o:                 U bfd_bread
libbfd.a:archive.o:00000000000021b0 T _bfd_bsd44_write_ar_hdr
libbfd.a:archive.o:                 U strcpy
libbfd.a:archive.o:                 U strlen
libbfd.a:archive.o:                 U strncmp
libbfd.a:archive.o:                 U strncpy
libbfd.a:archive.o:                 U strtol
libbfd.a:archive.o:                 U xstrdup
libbfd.a:bfd.o:                 U __asprintf_chk
libbfd.a:bfd.o:00000000000002b0 T _bfd_abort
libbfd.a:bfd.o:0000000000000e40 T bfd_alt_mach_code
libbfd.a:bfd.o:                 U bfd_arch_bits_per_address
libbfd.a:bfd.o:0000000000000260 T bfd_assert
libbfd.a:bfd.o:0000000000000000 D _bfd_assert_handler
libbfd.a:bfd.o:0000000000000450 T bfd_canonicalize_reloc
libbfd.a:bfd.o:                 U bfd_coff_get_comdat_section
libbfd.a:bfd.o:0000000000000510 T _bfd_default_error_handler
libbfd.a:bfd.o:0000000000000fd0 T bfd_demangle
libbfd.a:bfd.o:                 U memcpy
libbfd.a:bfd.o:                 U strchr
libbfd.a:bfd.o:                 U strlen
libbfd.a:opncls.o:0000000000000a50 T bfd_openr
libbfd.a:opncls.o:0000000000001100 T bfd_openr_iovec
libbfd.a:opncls.o:0000000000000b10 T bfd_openstreamr
libbfd.a:opncls.o:0000000000000bb0 T bfd_openw
libbfd.a:opncls.o:0000000000001240 T bfd_release
libbfd.a:opncls.o:                 U bfd_set_section_contents
libbfd.a:opncls.o:                 U bfd_set_section_size
libbfd.a:opncls.o:0000000000000000 B bfd_use_reserved_id
libbfd.a:opncls.o:00000000000010d0 T bfd_zalloc
libbfd.a:opncls.o:00000000000011d0 T bfd_zalloc2

libglib-2.0.a:
libglib-2.0.a:libglib_2_0_la-gallocator.o:0000000000000100 T g_allocator_free
libglib-2.0.a:libglib_2_0_la-gallocator.o:00000000000000f0 T g_allocator_new
libglib-2.0.a:libglib_2_0_la-gallocator.o:0000000000000150 T g_blow_chunks
libglib-2.0.a:libglib_2_0_la-gallocator.o:0000000000000160 T g_list_push_allocator
libglib-2.0.a:libglib_2_0_la-gallocator.o:0000000000000060 T g_mem_chunk_alloc
libglib-2.0.a:libglib_2_0_la-gallocator.o:0000000000000090 T g_mem_chunk_alloc0
libglib-2.0.a:libglib_2_0_la-gallocator.o:0000000000000110 T g_mem_chunk_clean
libglib-2.0.a:libglib_2_0_la-gallocator.o:0000000000000120 T g_mem_chunk_reset
libglib-2.0.a:libglib_2_0_la-gallocator.o:00000000000001b0 T g_node_pop_allocator
libglib-2.0.a:libglib_2_0_la-gallocator.o:00000000000001a0 T g_node_push_allocator
libglib-2.0.a:libglib_2_0_la-gallocator.o:                 U g_return_if_fail_warning
libglib-2.0.a:libglib_2_0_la-gallocator.o:                 U g_slice_alloc
libglib-2.0.a:libglib_2_0_la-gallocator.o:                 U g_slice_alloc0
libglib-2.0.a:libglib_2_0_la-gallocator.o:                 U g_slice_free1
libglib-2.0.a:libglib_2_0_la-gallocator.o:0000000000000190 T g_slist_pop_allocator
libglib-2.0.a:libglib_2_0_la-gslice.o:                 U g_private_get
libglib-2.0.a:libglib_2_0_la-gslice.o:                 U g_private_set
libglib-2.0.a:libglib_2_0_la-gslice.o:                 U g_return_if_fail_warning
libglib-2.0.a:libglib_2_0_la-gslice.o:00000000000010d0 T g_slice_alloc
libglib-2.0.a:libglib_2_0_la-gslice.o:0000000000001770 T g_slice_alloc0
libglib-2.0.a:libglib_2_0_la-gslice.o:00000000000017a0 T g_slice_copy
libglib-2.0.a:libglib_2_0_la-gslice.o:00000000000017e0 T g_slice_free1
libglib-2.0.a:libglib_2_0_la-gslice.o:0000000000001ae0 T g_slice_free_chain_with_offset

So, the syntax is:

<libname.a>:<objname.o>:<address> [TDB] <symbol>
<libname.a>:<objname.o>:          U     <symbol>

and we'll need to extract libname.a, the symbol type (e.g. T, D, B, U), and the symbol.
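One possible way to do this extraction is a regular expression over each line of nm -go output. The sketch below is only that: the pattern and the function name parse_nm_line are made up for illustration, and it assumes that neither the archive name nor the object name contains a colon.

```python
import re

# <libname.a>:<objname.o>:<address> <type> <symbol>
# The address is absent for undefined (U) symbols.
NM_LINE = re.compile(
    r"^(?P<lib>[^:]+):(?P<obj>[^:]+):"
    r"(?P<addr>[0-9A-Fa-f]+)?\s+(?P<type>[A-Za-z?-])\s+(?P<sym>\S+)$"
)


def parse_nm_line(line):
    """Return (libname, symbol_type, symbol), or None for non-symbol lines."""
    match = NM_LINE.match(line.rstrip("\n"))
    if match is None:
        # Header lines such as "libbfd.a:" and blank lines end up here.
        return None
    return match.group("lib"), match.group("type"), match.group("sym")


sample = [
    "libbfd.a:",
    "libbfd.a:archive.o:0000000000000790 T _bfd_add_bfd_to_archive_cache",
    "libbfd.a:archive.o:                 U bfd_alloc",
    "",
]
records = [rec for rec in map(parse_nm_line, sample) if rec is not None]
print(records)
```

The object name is matched but not returned, since the ordering problem here is between archives rather than between individual objects.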

We create a list of files. In each file struct, we remember all symbols and their types. Any type that is not U [undefined symbol] defines the symbol.

Note that as we build the symbol table, a library may have multiple U's [in various .o's] that refer to a symbol defined by another .o within it. So, we record each symbol only once, and if we see a non-U type, we "promote" it (e.g. if we saw U foo and later saw T foo, we change the type of foo to T [likewise for D and B]).
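The promotion rule can be sketched in a few lines of Python. The record layout (libname, type, symbol) and the function name build_symbol_tables are assumptions for the example; a dict per library stands in for the "file struct" described above.

```python
def build_symbol_tables(records):
    """Map each library to {symbol: type}, promoting U to a defining type.

    `records` is an iterable of (libname, symbol_type, symbol) tuples,
    a hypothetical layout chosen for this sketch.
    """
    tables = {}
    for lib, sym_type, sym in records:
        table = tables.setdefault(lib, {})
        # A defining type always wins; U is stored only when the symbol
        # has not been seen in this library yet.
        if sym_type != "U" or sym not in table:
            table[sym] = sym_type
    return tables


records = [
    ("liba.a", "U", "foo"),   # referenced by one .o ...
    ("liba.a", "T", "foo"),   # ... and defined by another .o in the same library
    ("liba.a", "U", "foo"),   # a later reference must not demote it
    ("liba.a", "U", "bar"),   # stays undefined: some other library must supply it
]
tables = build_symbol_tables(records)
print(tables)
```

After this pass, any symbol still marked U in a library's table is one that the library needs from somewhere else.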

Now we traverse the file list (call the current entry curfile). For each symbol in the file's symbol table, if it's of type U [undefined], we scan all files looking for a non-U symbol definition. If we find one (in symfile, say), we can output a dependency line for tsort: <curfile> <symfile>. We repeat this for all files and symbols.

Note that this is a bit wasteful, because it generates a line for each symbol and so can output many identical file dependency lines. So, we should keep track of the lines already output and emit a dependency line only for unique file pairs. Also note that it is possible to have both foo bar and bar foo. That is, in fact, a cycle. While we want just one copy each of foo bar and bar foo, the two should not exclude one another.
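A minimal sketch of this pass, assuming per-library symbol tables shaped like {libname: {symbol: type}} (that shape and the name dependency_pairs are illustrative assumptions). Instead of rescanning every file for each undefined symbol, it first builds an index of which libraries define each symbol; the resulting pairs are the same, the lookup is just cheaper.

```python
def dependency_pairs(tables):
    """Return unique (curfile, symfile) pairs: curfile needs a symbol from symfile."""
    # Index: symbol -> libraries that define it (any type other than U).
    definers = {}
    for lib, table in tables.items():
        for sym, sym_type in table.items():
            if sym_type != "U":
                definers.setdefault(sym, []).append(lib)

    pairs = []
    seen = set()
    for curfile, table in tables.items():
        for sym, sym_type in table.items():
            if sym_type != "U":
                continue
            for symfile in definers.get(sym, []):
                pair = (curfile, symfile)
                # Keep one copy per ordered pair; (a, b) and (b, a) are
                # distinct, so a cycle shows up as both lines.
                if symfile != curfile and pair not in seen:
                    seen.add(pair)
                    pairs.append(pair)
    return pairs


tables = {
    "libfoo.a": {"foo": "T", "bar": "U", "strlen": "U"},
    "libbar.a": {"bar": "T", "foo": "U", "baz": "U"},
    "libbaz.a": {"baz": "T", "qux": "T"},
}
for curfile, symfile in dependency_pairs(tables):
    print(curfile, symfile)   # one "<curfile> <symfile>" line per pair, as tsort expects
```

Symbols that no listed library defines (strlen here) simply produce no line; they are expected to come from libc or another library outside liblist.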

Okay, so now feed the output of the above to tsort, and it will give us the topologically sorted version of liblist that we want.
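If the script is written in Python anyway, the sort can also be done in-process. The sketch below uses the standard library's graphlib module (Python 3.9+) as a stand-in for tsort; this is a substitution, not what the answer describes, and the function name link_order and the library names are hypothetical. One behavioural difference to be aware of: graphlib raises an error on a cycle, so the sketch reports the cycle rather than returning an order.

```python
from graphlib import CycleError, TopologicalSorter


def link_order(pairs):
    """Sort (curfile, symfile) pairs so each curfile precedes its symfile.

    Returns (order, None) on success or (None, cycle) when a cycle exists.
    """
    sorter = TopologicalSorter()
    for curfile, symfile in pairs:
        # add(node, *predecessors): the library that needs a symbol must
        # appear on the command line before the library that defines it.
        sorter.add(symfile, curfile)
    try:
        return list(sorter.static_order()), None
    except CycleError as err:
        # args[1] is the list of nodes forming the detected cycle.
        return None, err.args[1]


order, cycle = link_order([("libapp.a", "libutil.a"), ("libutil.a", "libbase.a")])
print(order)

order2, cycle2 = link_order([("libfoo.a", "libbar.a"), ("libbar.a", "libfoo.a")])
print(cycle2)
```

When a cycle is reported, the libraries in it are the ones that still need to sit inside a linker group.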

As should be obvious, the script parsing can take some time, so the tsort output should be cached in a file and rebuilt by the makefile, based on a dependency list of liblist.

Convert some .a files to .o files

If a given library uses all [or most] of its .o files, instead of doing ar rv libname.a ..., consider doing ld -r -o libname.o ....

This is similar in approach to creating a shared library .so file, but the "big" .o can still be statically linked.

Now you have a single .o that will link faster than the .a, because the intra-library links have already been resolved. It will also help a bit with dependency cycles.

A slight extension to the topo script could tell you which libraries are good candidates for this.

Even if the normal build makefiles can't be changed, the "final" top level could take a .a and either extract it into .o's or use an ld force load option with -r to get the "big" .o.
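The two routes to a "big" .o might be scripted roughly as follows. This is a sketch under assumptions: the function and file names are hypothetical, and it takes GNU ld's --whole-archive to be the "force load" option meant above (the answer does not name the option).

```python
def relocatable_from_objects(output, objects, linker="ld"):
    """argv for combining individual .o files into one relocatable .o."""
    return [linker, "-r", "-o", output, *objects]


def relocatable_from_archive(output, archive, linker="ld"):
    """argv for pulling every member of an existing .a into one relocatable .o.

    --whole-archive is assumed here to be the "force load" option; it makes
    ld include all archive members rather than only those that are referenced.
    """
    return [
        linker, "-r", "-o", output,
        "--whole-archive", archive, "--no-whole-archive",
    ]


print(" ".join(relocatable_from_objects("libname.o", ["a.o", "b.o"])))
print(" ".join(relocatable_from_archive("libname.o", "libname.a")))
```

Either argv list can be passed to subprocess.run from the top-level build step; the resulting libname.o then goes on the final link line in place of libname.a.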
