undefined reference to `_GLOBAL_OFFSET_TABLE_' in gcc 32-bit code for a trivial function, freestanding OS
Question
I have a small C source file (function.c):
int function()
{
    return 0x1234abce;
}
I am using a 64-bit machine. However, I want to write a small 32-bit OS, so I want to compile the code into a 'pure' assembly/binary file.
I compile my code with:
gcc function.c -c -m32 -o function.o -ffreestanding # This gives you the object file
I link it with:
ld -o function.bin -m elf_i386 -Ttext 0x0 --oformat binary function.o
I get the following error:
function.o: In function `function':
function.c:(.text+0x9): undefined reference to `_GLOBAL_OFFSET_TABLE_'
Recommended answer
You need -fno-pie; the default (in most modern distros) is -fpie: generate code for a position-independent executable. This is a code-gen option separate from the -pie linker option (which gcc also passes by default), and is independent of -ffreestanding. -fpie -ffreestanding implies you want a freestanding PIE that uses a GOT, so that's what GCC targets.
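As a rough sketch of the fix, here is the question's function.c again with the adjusted build commands in a comment. The commands are the ones from the question with only -fno-pie added; the file names are assumed to match the question, and nothing else about the asker's setup is known.

```c
/*
 * function.c, as in the question.
 *
 * Build sketch (assumed: same file names as in the question, with
 * -fno-pie added so GCC stops generating GOT-relative code):
 *
 *   gcc -m32 -ffreestanding -fno-pie -c function.c -o function.o
 *   ld -m elf_i386 -Ttext 0x0 --oformat binary -o function.bin function.o
 */
int function(void)
{
    /* Same constant as in the question; 0x1234abce == 305441742. */
    return 0x1234abce;
}
```

With -fno-pie the object file should no longer reference _GLOBAL_OFFSET_TABLE_, so the plain ld invocation has nothing left to resolve.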
-fpie only costs a bit of speed in 64-bit code (where RIP-relative addressing is possible) but is quite bad for 32-bit code; compilers get a pointer to the GOT in one of the integer registers (tying up another one of the 8) and access static data relative to that address with [reg + disp32] addressing modes like [eax + foo@GOTOFF].
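To see that cost for yourself, a small test file with some static data is enough. The sketch below is illustrative only: the file name got_demo.c and the names counter and bump are hypothetical, not from the question.

```c
/*
 * got_demo.c (hypothetical file name).
 *
 * Compare the assembly from these two commands (assumes a GCC with
 * 32-bit support installed):
 *
 *   gcc -m32 -O0 -fpie    -S got_demo.c -o pie.s
 *   gcc -m32 -O0 -fno-pie -S got_demo.c -o nopie.s
 *
 * In the -fpie output you would expect counter to be addressed relative
 * to a GOT base register (something like counter@GOTOFF(%eax) in AT&T
 * syntax); in the -fno-pie output it is addressed absolutely.
 */
static int counter; /* static data, zero-initialized */

int bump(void)
{
    /* Each access to counter goes through the GOT base in PIE code. */
    return ++counter;
}
```

The function behaves the same either way; only the addressing mode used to reach counter changes.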
With optimization disabled, gcc -fpie -m32 generates the address of the GOT in a register even though the function doesn't access any static data. You can see this if you look at your compiler output (with gcc -S instead of -c on the machine you're compiling on).
On Godbolt we can use -m32 -fpie to give the same effect as a GCC configured with --enable-default-pie:
# gcc9.2 -O0 -m32 -fpie
function():
        push    ebp
        mov     ebp, esp                                # frame pointer
        call    __x86.get_pc_thunk.ax
        add     eax, OFFSET FLAT:_GLOBAL_OFFSET_TABLE_  # EAX points to the GOT
        mov     eax, 305441742                          # overwrite with the return value
        pop     ebp
        ret
__x86.get_pc_thunk.ax:          # this is the helper function gcc calls
        mov     eax, DWORD PTR [esp]
        ret
The "thunk" returns its return address, i.e. the address of the instruction after the call. The .ax name means to return in EAX. Modern GCC can choose any register; traditionally the 32-bit PIC base register was always EBX, but modern GCC chooses a call-clobbered register when that avoids an extra save/restore of EBX.
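A loose C-level analogy for what the thunk does, using the GCC/Clang builtin __builtin_return_address. This is not what GCC emits for PIC code; the function name where_am_i is hypothetical and only mirrors the idea of "return the address I will return to".

```c
/*
 * Analogy only: returns the address of the instruction that follows the
 * call to this function, which is what __x86.get_pc_thunk.ax hands back
 * in EAX. noinline keeps it a real call so a return address exists.
 */
__attribute__((noinline))
void *where_am_i(void)
{
    /* Level 0 = the return address of the current function. */
    return __builtin_return_address(0);
}
```

GCC's PIC code then adds a link-time constant to that address to reach the GOT, which is the add eax, OFFSET FLAT:_GLOBAL_OFFSET_TABLE_ step in the listing above.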
Fun fact: call +0; pop eax would be more efficient, and only 1 byte larger at each call site. You might think that would unbalance the return-address predictor stack, but in fact call +0 is special-cased on most CPUs to not do that. See http://blog.stuffedcow.net/2018/04/ras-microbenchmarks/#call0 . (call +0 means the rel32 = 0, so it calls the next instruction. That's not how NASM would interpret that syntax, though.)
clang doesn't generate a GOT pointer unless it needs one, even at -O0. When it does need one, it uses call +0; pop %eax: https://godbolt.org/z/GFY9Ht