Why is the ELF entry point 0x8048000 not changeable with the "ld -e" option?
Following up on "Why is the ELF execution entry point virtual address of the form 0x80xxxxx and not zero 0x0?" and "Why do virtual memory addresses for linux binaries start at 0x8048000?": why can't I make ld use a different entry point than the default with ld -e?
If I do so, I get a segmentation fault with return code 139, even for addresses close to the default entry point. Why?
EDIT:
I will make the question more specific:
.text
.globl _start
_start:
movl $0x4,%eax # eax = code for 'write' system call
movl $1,%ebx # ebx = file descriptor to standard output
movl $message,%ecx # ecx = pointer to the message
movl $13,%edx # edx = length of the message
int $0x80 # make the system call
movl $0x0,%ebx # the status returned by 'exit'
movl $0x1,%eax # eax = code for 'exit' system call
int $0x80 # make the system call
.data
.globl message
message:
.string "Hello world\n" # The message as data
If I assemble this with as program.s -o program.o and then link it statically with ld -N program.o -o program, readelf -l program shows 0x0000000000400078 as the VirtAddr of the text segment and 0x400078 as the entry point. When run, "Hello world" is printed.
However, when I try to link with ld -N -e0x400082 -Ttext=0x400082 program.o -o program (moving text segment and entry point by 4 bytes), the program is killed. Inspecting it with readelf -l now shows two different headers of type LOAD, one at 0x0000000000400082 and one at 0x00000000004000b0.
When I try 0x400086, it all works, and there is only one LOAD segment.
- What's going on here?
- Which memory addresses may I choose, which ones can't I choose, and why?
Thank you.
why cannot I make ld use a different entry point than the default with ld -e
You sure can. This:
int foo(int argc, char *argv[]) { return 0; }
gcc main.c -Wl,-e,foo
wouldn't work, because execution doesn't start at main. It starts at _start, which is linked in from glibc's startup object (crt1.o on Linux; crt0.o on some other systems) and arranges for things like dynamic linking to start up properly. By redirecting the entry point from _start to foo, you've bypassed all of that required glibc initialization, and so things don't work.
But if you don't need dynamic linking, and are willing to do what glibc normally does for you, then you can name the entry point whatever you want. Example:
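#include <sys/syscall.h>   /* SYS_write, SYS_exit */
#include <unistd.h>        /* syscall() */

int foo(void)
{
    syscall(SYS_write, 1, "Hello, world\n", 13);
    syscall(SYS_exit, 0);  /* must not return: the entry point has no caller */
    return 0;              /* not reached */
}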
gcc t.c -static -nostartfiles -Wl,-e,foo && ./a.out
Hello, world
Oh, and the title of this question doesn't match your actual question (bad idea(TM)).
To answer the question in the title: you sure can change the address your executable is linked at. By default, you get a 0x8048000 load address (only in 32-bit; the 64-bit default is 0x400000).
You can easily change that to e.g. 0x80000 by adding -Wl,-Ttext-segment=0x80000 to the link line.
Update:
However, when I try to link with ld -N -e0x400082 -Ttext=0x400082 program.o -o program (moving text segment and entry point by 4 bytes), the program will be killed.
Well, it is impossible to assign Ttext to 0x400082 without violating the .text section alignment constraint (which is 4). You must keep the .text address aligned on at least a 4-byte boundary (or change the required alignment of .text).
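To make the alignment argument concrete, here is a minimal shell sketch that takes the 4-byte constraint stated above at face value and checks a few of the addresses from the question against it. The function names misalignment and is_aligned are my own, not part of binutils.

```shell
#!/bin/sh
# misalignment ADDR ALIGN: print how many bytes ADDR lies past the
# previous ALIGN-byte boundary (0 means ADDR is aligned).
misalignment() {
    echo $(( $1 % $2 ))
}

# is_aligned ADDR ALIGN: succeed (exit status 0) only if ADDR is a
# multiple of ALIGN.
is_aligned() {
    [ "$(misalignment "$1" "$2")" -eq 0 ]
}

# Addresses taken from the question; 4 is the .text alignment assumed above.
for addr in 0x400078 0x400080 0x400082; do
    if is_aligned "$addr" 4; then
        echo "$addr: aligned to 4 bytes"
    else
        echo "$addr: $(misalignment "$addr" 4) bytes past a 4-byte boundary"
    fi
done
```

The actual alignment the assembler recorded for .text can be read from the Align column of readelf -S program.o, so it is worth checking that value rather than trusting the 4 used here.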
When I set the start address to 0x400078, 0x40007c, 0x400080, 0x400084, ..., 0x400098 and use GNU-ld 2.20.1, the program works.
However, when I use a current CVS snapshot of binutils, the program works for 0x400078, 0x40007c, 0x400088, 0x40008c, and is killed for 0x400080, 0x400084, 0x400090, 0x400094, 0x400098. This might be a bug in the linker, or I may be violating some other constraint (I don't see which, though).
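If you want to repeat that experiment, a rough sketch like the following generates one link command per 4-byte-aligned start address in the range tried above. It only prints the commands; the helper names link_cmd and candidate_addrs are hypothetical, and program.o is assumed to be the object file from the question.

```shell
#!/bin/sh
# link_cmd ADDR: print the ld invocation that places both .text and the
# entry point at ADDR (same flags as in the question).
link_cmd() {
    echo "ld -N -e$1 -Ttext=$1 program.o -o program"
}

# candidate_addrs FIRST LAST: print FIRST, FIRST+4, ... up to and
# including LAST, in hex. FIRST is assumed to be 4-byte aligned already.
candidate_addrs() {
    addr=$(( $1 ))
    while [ "$addr" -le $(( $2 )) ]; do
        printf '0x%x\n' "$addr"
        addr=$(( addr + 4 ))
    done
}

# Print the commands for 0x400078 through 0x400098.
for a in $(candidate_addrs 0x400078 0x400098); do
    link_cmd "$a"
done
```

Running each printed command and then readelf -l program shows, per address, whether the linker emitted one LOAD segment or two.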
At this point, if you are really interested, I suggest downloading binutils sources, building ld, and figuring out what exactly causes it to create two PT_LOAD segments instead of one.
Update 2:
Force new segment for sections with overlapping LMAs.
Ah! That just means you need to move .data out of the way. This makes a working executable:
ld -N -o t t.o -e0x400080 -Ttext=0x400080 -Tdata=0x400180
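As a sanity check on a chosen -Ttext/-Tdata pair, the sketch below tests whether two address ranges intersect. It is not the linker's own logic, only an illustration of "out of the way". The sizes 0x22 for .text and 0xd for .data are my estimates for the program in the question; read the real ones from readelf -S. The function name ranges_overlap is hypothetical.

```shell
#!/bin/sh
# ranges_overlap START1 SIZE1 START2 SIZE2: succeed if the half-open
# ranges [START1, START1+SIZE1) and [START2, START2+SIZE2) intersect.
ranges_overlap() {
    end1=$(( $1 + $2 ))
    end2=$(( $3 + $4 ))
    [ $(( $1 )) -lt "$end2" ] && [ $(( $3 )) -lt "$end1" ]
}

TEXT_SIZE=0x22   # assumed size of .text for the program in the question
DATA_SIZE=0xd    # assumed size of .data: "Hello world\n" plus the NUL

# Addresses from the working command line above.
if ranges_overlap 0x400080 "$TEXT_SIZE" 0x400180 "$DATA_SIZE"; then
    echo ".text and .data overlap"
else
    echo ".text and .data are disjoint"
fi
```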