GCC behavior for unresolved weak functions
Consider the simple program below:
__attribute__((weak)) void weakf(void);
int main(int argc, char *argv[])
{
weakf();
}
Compiled with gcc and run on a Linux PC, it segfaults. On ARM CM0 (arm-none-eabi-gcc), the linker instead replaces the undefined symbol with a jump to the following instruction and a nop.
Where is this behavior documented? Is it possible to change it through command-line options? I have been through the GCC and LD documentation, and found no information about it.
The ARM compiler documentation, however, explains this clearly.
man nm
I was reading some docs and happened to come across a related quote. man nm says:
"V"
"v" The symbol is a weak object. When a weak defined symbol is linked with a normal defined symbol, the normal defined symbol is used with no error. When a weak undefined symbol is linked and the symbol is not defined, the value of the weak symbol becomes zero with no error. On some systems, uppercase indicates that a default value has been specified.
"W"
"w" The symbol is a weak symbol that has not been specifically tagged as a weak object symbol. When a weak defined symbol is linked with a normal defined symbol, the normal defined symbol is used with no error. When a weak undefined symbol is linked and the symbol is not defined, the value of the symbol is determined in a system-specific manner without error. On some systems, uppercase indicates that a default value has been specified.
nm is part of Binutils, which GCC uses under the hood, so this should be canonical enough.
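Since the quote says an undefined weak symbol ends up with a zero or system-specific value, a common defensive idiom is to test the function's address before calling it. The following is only a sketch: it assumes the toolchain resolves an undefined weak function to address 0, which is typical for GCC on ELF targets but is not guaranteed by the nm text above. The helper name call_weakf_if_defined is made up for this illustration.

```c
#include <stdio.h>

/* Weak declaration: the link succeeds even if nothing defines weakf(). */
__attribute__((weak)) void weakf(void);

/* Hypothetical helper: returns 1 if weakf() was linked in and called,
   0 if the symbol stayed undefined and the call was skipped. */
int call_weakf_if_defined(void)
{
    if (weakf) {   /* assumed to be a null address when undefined */
        weakf();
        return 1;
    }
    puts("weakf is not defined, skipping the call");
    return 0;
}
```

Calling this helper from main instead of calling weakf() directly avoids the jump to address 0 on targets where that assumption holds.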
Then, taking your source file as an example:
main.c
__attribute__((weak)) void weakf(void);
int main(int argc, char *argv[])
{
weakf();
}
we do:
gcc -O0 -ggdb3 -std=c99 -Wall -Wextra -pedantic -o main.out main.c
nm main.out
which contains:
w weakf
and so its value is system-specific. However, I can't find where the per-system behavior is defined. I don't think you can do better than reading the Binutils source here.
v would be fixed to 0, but that is used for undefined variables (which are objects). See also: How to make weak linking work with GCC?
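For the variable (object) case, the address of a weak variable can be compared against NULL before it is read. This is a minimal sketch, assuming a GCC-style ELF toolchain that leaves the address at 0 when nothing defines the symbol; optional_counter and read_optional_counter are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Weak declaration of a variable that nothing in this program defines.
   The name is hypothetical. */
extern int optional_counter __attribute__((weak));

/* Returns the variable's value if it was linked in, otherwise fallback. */
int read_optional_counter(int fallback)
{
    if (&optional_counter != NULL) {
        return optional_counter;
    }
    return fallback;   /* symbol stayed undefined, so do not dereference it */
}
```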
Then:
gdb -batch -ex 'disassemble/rs main' main.out
gives:
Dump of assembler code for function main:
main.c:
4 {
0x0000000000001135 <+0>: 55 push %rbp
0x0000000000001136 <+1>: 48 89 e5 mov %rsp,%rbp
0x0000000000001139 <+4>: 48 83 ec 10 sub $0x10,%rsp
0x000000000000113d <+8>: 89 7d fc mov %edi,-0x4(%rbp)
0x0000000000001140 <+11>: 48 89 75 f0 mov %rsi,-0x10(%rbp)
5 weakf();
0x0000000000001144 <+15>: e8 e7 fe ff ff callq 0x1030 <weakf@plt>
0x0000000000001149 <+20>: b8 00 00 00 00 mov $0x0,%eax
6 }
0x000000000000114e <+25>: c9 leaveq
0x000000000000114f <+26>: c3 retq
End of assembler dump.
which means it gets resolved at the PLT.
Then, since I don't fully understand the PLT, I verified experimentally that it resolves to address 0 and segfaults:
gdb -nh -ex run -ex bt main.out
I suppose the same happens on ARM: it must just set it to 0 as well.
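If the goal is for the call to be harmless on every target, one option that does not depend on how the linker treats an undefined weak reference is to supply a weak default definition. This is a sketch of that alternative, not something described in the documentation quoted here; the counter and the call_weakf helper are hypothetical and exist only to make the effect observable.

```c
/* Counts how often the fallback ran (illustration only). */
static int default_calls = 0;

/* Weak definition: used only when no strong weakf() is linked in. */
__attribute__((weak)) void weakf(void)
{
    default_calls++;
}

/* Hypothetical helper: calls weakf() unconditionally and returns how many
   times the weak fallback has run so far. */
int call_weakf(void)
{
    weakf();
    return default_calls;
}
```

With this in place the symbol is always defined, so the call never targets address 0. A strong definition of weakf() in another object file would silently replace the fallback at link time.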