c++17, lto, -static-libstdc++ issue: Warning: relocation refers to discarded section with ld.gold, then segfault in __run_exit_handlers


Problem description


I am after some suggestions as to how to go about debugging a significant problem that I cannot reduce to a minimal example.

The problem: I compile my application which links to a number of different libraries. The flags include: -static-libstdc++ -static-libgcc -pipe -std=c++1z -fno-PIC -flto=10 -m64 -O3 -flto=10 -fuse-linker-plugin -fuse-ld=gold -UNDEBUG -lrt -ldl

The compiler is gcc-7.3.0, compiled against binutils-2.30. Boost is compiled with the same flags as the rest of the program, and linked statically.

When the program is linked, I get various "relocation refers to discarded section" warnings, both in my own code and in boost. For instance:

/tmp/ccq2Ddku.ltrans13.ltrans.o:<artificial>:function boost::system::(anonymous namespace)::generic_error_category::message(int) const: warning: relocation refers to discarded section

Then when I run the program, it segfaults on destruction, with the backtrace:

Program received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x00007ffff7345a49 in __run_exit_handlers () from /lib64/libc.so.6
#2  0x00007ffff7345a95 in exit () from /lib64/libc.so.6
#3  0x00007ffff732eb3c in __libc_start_main () from /lib64/libc.so.6
#4  0x000000000049b3e3 in _start ()

The function pointer being called is 0x0 (null).

If I remove -static-libstdc++, the linker warnings and runtime segfault go away.

If I change from c++1z to c++14, the linker warnings and runtime segfault go away.

If I remove -flto, the linker warnings and runtime segfault go away.

If I add "-g" to the compile flags, the linker warnings and runtime segfault go away.

I have tried asking gold for extra debugging, by specifying -Wl,--debug=all, but it tells me seemingly nothing relevant.

If I try to use a small section of the code that appears relevant, compiling and linking it separately but against the same boost libraries (i.e. attempting to produce a minimal example), there are no linker warnings, and the program runs to completion without issues.

Help! What can I do to narrow the problem down?

Solution

This warning usually indicates an inconsistency in the contents of a COMDAT group between two compilation units. If the compiler emits a COMDAT group G with symbol A defined in one compilation unit, but emits the same group G with symbols A and B defined in a second compilation unit, the linker will keep group G from the first compilation unit and discard group G from the second. Any reference to symbol B from outside the group in the second compilation unit will then produce this warning.

The cause is usually a bug in the compiler, and using -flto makes it that much harder to diagnose. In this case, your second compilation unit is the result of link-time optimization (the *.ltrans.o file name). With LTO, it's quite believable that many of the changes you've mentioned will make the problem go away.

The very latest version of gold on the master branch of the binutils git repo has a new [-Wl,]--debug=plugin option, which will save a log and all the temporary .ltrans.o files. Having the log and those files, along with all the original input files (which you can get a list of by adding the [-Wl,]-t option), should help isolate the problem better.

The latest version of gold will also print the symbol referenced by the relocation. For a local symbol, it will show the symbol index; use readelf -s to get more info about the symbol. For a global symbol, it will show the name; you can add the --no-demangle option for the exact name.

If it's a local symbol, the problem is almost certainly the compiler. References from outside a comdat group to a local symbol in the group are strictly forbidden.

If it's a global symbol, it could be either a compiler problem or a one-definition rule (ODR) violation in your sources. You'll need to identify the comdat group in the named object file, find its key symbol, then find the object file that provided the definition kept by the linker (the -y option will help), and compare the symbols defined in those groups by the two objects. These steps should help:

(1) Starting from the error message:

b.o(.data+0x0): warning: relocation refers to symbol "two" defined in discarded section

(2) Look for symbol "two" in b.o:

$ readelf -sW b.o | grep two
     7: 0000000000000008     0 NOTYPE  WEAK   DEFAULT    6 two

The next-to-last field ("6") is the section number where "two" is defined.
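
Step 2 can be scripted instead of read off by eye. The following is a minimal sketch, assuming the `readelf -sW` column layout shown above (Num, Value, Size, Type, Bind, Vis, Ndx, Name); `section_of_symbol` is a hypothetical helper name, not part of binutils.

```shell
# Hypothetical helper (not part of binutils): reads `readelf -sW` output on
# stdin and prints the Ndx column for every symbol whose name is exactly $1.
# Assumes the 8-column layout shown above, where Ndx is field 7 and Name is
# field 8.
section_of_symbol() {
    awk -v sym="$1" '$1 ~ /^[0-9]+:$/ && $8 == sym { print $7 }'
}

# Intended use:
#   readelf -sW b.o | section_of_symbol two
```

Matching on the exact name avoids the false hits a plain grep gives when one symbol name is a substring of another, which is common with mangled C++ names.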

(3) Verify that section 6 is in fact a member of a comdat group:

$ readelf -SW b.o
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 6] .one              PROGBITS        0000000000000000 000058 000018 00 WAG  0   0  1

The "G" in the sh_flags field ("Flg") indicates the section belongs to a comdat group.
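
The flag check in step 3 could be automated roughly as follows. This is a sketch under the assumption that `readelf -SW` prints one line per section in the layout shown above; `section_flags` and `in_comdat_group` are hypothetical helper names.

```shell
# Hypothetical helper: reads `readelf -SW` output on stdin and prints the Flg
# column of section number $1 (prints nothing if that section has no flags).
section_flags() {
    awk -v nr="$1" '
        { sub(/^ *\[ */, ""); sub(/\]/, "") }   # "[ 6] .one" becomes "6 .one"
        $1 == nr && NF == 11 { print $8 }       # 11 fields only when Flg is non-empty
    '
}

# Succeeds when section $1 carries the "G" (SHF_GROUP) flag.
in_comdat_group() {
    case "$(section_flags "$1")" in
        *G*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Intended use:
#   readelf -SW b.o | in_comdat_group 6 && echo "section 6 is in a comdat group"
```

The bracketed section number is normalized first because readelf pads it (`[ 6]` versus `[10]`), which would otherwise shift the field positions.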

(4) Find the comdat group containing the section:

$ readelf -g b.o
COMDAT group section [    1] `.group' [one] contains 1 sections:
   [Index]    Name
   [    6]   .one

This shows us that section 6 is a member of group section 1.

(5) Find the key symbol for that group:

$ readelf -SW b.o
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 1] .group            GROUP           0000000000000000 000040 000008 04      7   8  4

The sh_info field ("Inf") tells us the key symbol is symbol #8, which is "one". (That should match the name shown in brackets in step 4.)

$ readelf -sW b.o
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     8: 0000000000000000     0 NOTYPE  WEAK   DEFAULT    6 one
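
The two lookups in step 5 can be chained. A minimal sketch, again assuming the readelf layouts shown above; `section_info` and `symbol_name_at` are hypothetical helper names.

```shell
# Hypothetical helpers, assuming the readelf column layouts shown above.

# Reads `readelf -SW` output on stdin; prints the Inf column (sh_info) of
# section number $1. Inf is the next-to-last column even when Flg is empty.
section_info() {
    awk -v nr="$1" '
        { sub(/^ *\[ */, ""); sub(/\]/, "") }
        $1 == nr && NF >= 9 { print $(NF - 1) }
    '
}

# Reads `readelf -sW` output on stdin; prints the name of the symbol whose
# index (the Num column) is $1.
symbol_name_at() {
    awk -v idx="$1" '$1 == (idx ":") { print $8 }'
}

# Intended use, for group section 1 of b.o:
#   key=$(readelf -SW b.o | section_info 1)
#   readelf -sW b.o | symbol_name_at "$key"
```

Reading Inf from the end of the line rather than by fixed position is deliberate: the `.group` section has an empty Flg column, so counting fields from the left would pick the wrong value.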

(6) Now you can add the -y one option to your link to find which objects provided a definition of "one":

$ gcc -Wl,-y,one ...
a.o: definition of one
b.o: definition of one

The first one listed (a.o) is the one that gold keeps; it will discard all subsequent comdat groups with the same key symbol.
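
If the link involves many objects, picking out the first definer from the trace can be scripted. This sketch assumes the trace lines have already been captured and have exactly the `object: definition of symbol` form shown above; `first_definer` is a hypothetical helper name.

```shell
# Hypothetical helper: reads the linker's -y trace lines on stdin and prints
# the first object reported as defining symbol $1. Assumes each trace line has
# the form "<object>: definition of <symbol>" as in the example above.
first_definer() {
    awk -v sym="$1" -F': ' '$2 == ("definition of " sym) { print $1; exit }'
}

# Intended use, with the trace saved in a file of your choosing:
#   first_definer one < trace.txt
```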

If you use the same techniques to examine the comdat group that defines "one" in a.o, and compare the symbols that belong to that group with those that belong to the group in b.o, that should give you more clues.
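
That final comparison might be sketched as below. It assumes each group has a single member section, as in the example above, and that you have already found the member section number in each object; `symbols_in_section` and `only_in_second` are hypothetical helper names.

```shell
# Hypothetical helpers for comparing the two groups. Assumes each group has a
# single member section; for larger groups, repeat per member section.

# Reads `readelf -sW` output on stdin; prints the names of the symbols defined
# in section number $1.
symbols_in_section() {
    awk -v nr="$1" '$1 ~ /^[0-9]+:$/ && $7 == nr && $8 != "" { print $8 }'
}

# Prints the lines of file $2 that do not appear in file $1.
only_in_second() {
    awk 'FILENAME == ARGV[1] { seen[$0] = 1; next } !($0 in seen)' "$1" "$2"
}

# Intended use, where $a_sec and $b_sec are the member section numbers found
# in a.o and b.o with the steps above:
#   readelf -sW a.o | symbols_in_section "$a_sec" > a.syms
#   readelf -sW b.o | symbols_in_section "$b_sec" > b.syms
#   only_in_second a.syms b.syms
```

Any name printed by the last command is defined only in the discarded copy of the group, so it is a candidate for the symbol the warning refers to.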
