为什么 ELF 可执行文件可以有 4 个 LOAD 段? [英] Why an ELF executable could have 4 LOAD segments?
问题描述
有一个远程 64 位 *nix 服务器可以编译用户提供的代码(应该用 Rust 编写,但我认为这并不重要,因为它使用 LLVM).我不知道它使用哪个编译器/链接器标志,但编译后的 ELF 可执行文件看起来很奇怪——它有 4 个 LOAD 段:
There is a remote 64-bit *nix server that can compile a user-provided code (which should be written in Rust, but I don't think it matters since it uses LLVM). I don't know which compiler/linker flags it uses, but the compiled ELF executable looks weird - it has 4 LOAD segments:
$ readelf -e executable
...
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
...
LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000004138 0x0000000000004138 R 0x1000
LOAD 0x0000000000005000 0x0000000000005000 0x0000000000005000
0x00000000000305e9 0x00000000000305e9 R E 0x1000
LOAD 0x0000000000036000 0x0000000000036000 0x0000000000036000
0x000000000000d808 0x000000000000d808 R 0x1000
LOAD 0x0000000000043da0 0x0000000000044da0 0x0000000000044da0
0x0000000000002290 0x00000000000024a0 RW 0x1000
...
在我自己的系统上,我正在查看的所有可执行文件只有 2 个 LOAD 段:
On my own system all executables that I was looking at only have 2 LOAD segments:
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
...
LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x00000000003000c0 0x00000000003000c0 R E 0x200000
LOAD 0x00000000003002b0 0x00000000005002b0 0x00000000005002b0
0x00000000000776c8 0x000000000009b200 RW 0x200000
...
- 在什么情况下(编译器/链接器版本、标志等)编译器可能会构建具有 4 个 LOAD 段的 ELF?
- 有 4 个 LOAD 段有什么意义?我想拥有一个具有读取但没有执行权限的段可能有助于抵御某些漏洞,但为什么要有两个这样的段呢?
推荐答案
典型的 BFD-ld 或 Gold 链接 Linux 可执行文件有 2 个可加载段,其中 ELF
标头与 .text 合并
和.rodata
进入第一个RE
段,和.data
、.bss
等可写部分合并到第二个 RW
段中.
A typical BFD-ld or Gold linked Linux executable has 2 loadable segments, with the ELF
header merged with .text
and .rodata
into the first RE
segment, and .data
, .bss
and other writable sections merged into the second RW
segment.
以下是分段映射的典型部分:
Here is the typical section to segment mapping:
$ echo "int foo; int main() { return 0;}" | clang -xc - -o a.out-gold -fuse-ld=gold
$ readelf -Wl a.out-gold
Elf file type is EXEC (Executable file)
Entry point 0x400420
There are 9 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000040 0x0000000000400040 0x0000000000400040 0x0001f8 0x0001f8 R 0x8
INTERP 0x000238 0x0000000000400238 0x0000000000400238 0x00001c 0x00001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x0006b0 0x0006b0 R E 0x1000
LOAD 0x000e18 0x0000000000401e18 0x0000000000401e18 0x0001f8 0x000200 RW 0x1000
DYNAMIC 0x000e28 0x0000000000401e28 0x0000000000401e28 0x0001b0 0x0001b0 RW 0x8
NOTE 0x000254 0x0000000000400254 0x0000000000400254 0x000020 0x000020 R 0x4
GNU_EH_FRAME 0x00067c 0x000000000040067c 0x000000000040067c 0x000034 0x000034 R 0x4
GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0x10
GNU_RELRO 0x000e18 0x0000000000401e18 0x0000000000401e18 0x0001e8 0x0001e8 RW 0x8
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.ABI-tag .dynsym .dynstr .gnu.hash .hash .gnu.version .gnu.version_r .rela.dyn .init .text .fini .rodata .eh_frame .eh_frame_hdr
03 .fini_array .init_array .dynamic .got .got.plt .data .bss
04 .dynamic
05 .note.ABI-tag
06 .eh_frame_hdr
07
08 .fini_array .init_array .dynamic .got .got.plt
这优化了内核加载此类可执行文件所必须执行的 mmap
的数量,但要付出安全代价:.rodata
中的数据不应该是可执行的, 但是是(因为它与 .text
合并,它必须是可执行的).这可能会显着增加试图劫持进程的人的攻击面.
This optimizes the number of mmap
s that the kernel must perform to load such executable, but at a security cost: the data in .rodata
shouldn't be executable, but is (because it's merged with .text
, which must be executable). This may significantly increase the attack surface for someone trying to hijack a process.
较新的 Linux 系统,特别是使用 LLD
链接二进制文件,优先考虑安全性而不是速度,并将 ELF 标头和 .rodata
放入第一个 R
-only 段,产生 3 个加载段并提高了安全性.这是一个典型的映射:
Newer Linux systems, in particular using LLD
to link binaries, prioritize security over speed, and put ELF header and .rodata
into the first R
-only segment, resulting in 3 load segments and improved security. Here is a typical mapping:
$ echo "int foo; int main() { return 0;}" | clang -xc - -o a.out-lld -fuse-ld=lld
$ readelf -Wl a.out-lld
Elf file type is EXEC (Executable file)
Entry point 0x201000
There are 10 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000040 0x0000000000200040 0x0000000000200040 0x000230 0x000230 R 0x8
INTERP 0x000270 0x0000000000200270 0x0000000000200270 0x00001c 0x00001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x000000 0x0000000000200000 0x0000000000200000 0x000558 0x000558 R 0x1000
LOAD 0x001000 0x0000000000201000 0x0000000000201000 0x000185 0x000185 R E 0x1000
LOAD 0x002000 0x0000000000202000 0x0000000000202000 0x001170 0x002005 RW 0x1000
DYNAMIC 0x003010 0x0000000000203010 0x0000000000203010 0x000150 0x000150 RW 0x8
GNU_RELRO 0x003000 0x0000000000203000 0x0000000000203000 0x000170 0x001000 R 0x1
GNU_EH_FRAME 0x000440 0x0000000000200440 0x0000000000200440 0x000034 0x000034 R 0x1
GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0
NOTE 0x00028c 0x000000000020028c 0x000000000020028c 0x000020 0x000020 R 0x4
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.ABI-tag .rodata .dynsym .gnu.version .gnu.version_r .gnu.hash .hash .dynstr .rela.dyn .eh_frame_hdr .eh_frame
03 .text .init .fini
04 .data .tm_clone_table .fini_array .init_array .dynamic .got .bss
05 .dynamic
06 .fini_array .init_array .dynamic .got
07 .eh_frame_hdr
08
09 .note.ABI-tag
不要落后,较新的 BFD-ld(我的版本是 2.31.1)也使 ELF
标头和 .rodata
只读,但未能将两个 R
-only 段合并为一个,产生 4 个可加载段:
Not to be left behind, the newer BFD-ld (my version is 2.31.1) also makes ELF
header and .rodata
read-only, but fails to merge two R
-only segments into one, resulting in 4 loadable segments:
$ echo "int foo; int main() { return 0;}" | clang -xc - -o a.out-bfd -fuse-ld=bfd
$ readelf -Wl a.out-bfd
Elf file type is EXEC (Executable file)
Entry point 0x401020
There are 11 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000040 0x0000000000400040 0x0000000000400040 0x000268 0x000268 R 0x8
INTERP 0x0002a8 0x00000000004002a8 0x00000000004002a8 0x00001c 0x00001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x0003f8 0x0003f8 R 0x1000
LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x00018d 0x00018d R E 0x1000
LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x000110 0x000110 R 0x1000
LOAD 0x002e40 0x0000000000403e40 0x0000000000403e40 0x0001e8 0x0001f0 RW 0x1000
DYNAMIC 0x002e50 0x0000000000403e50 0x0000000000403e50 0x0001a0 0x0001a0 RW 0x8
NOTE 0x0002c4 0x00000000004002c4 0x00000000004002c4 0x000020 0x000020 R 0x4
GNU_EH_FRAME 0x002004 0x0000000000402004 0x0000000000402004 0x000034 0x000034 R 0x4
GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0x10
GNU_RELRO 0x002e40 0x0000000000403e40 0x0000000000403e40 0x0001c0 0x0001c0 R 0x1
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.ABI-tag .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn
03 .init .text .fini
04 .rodata .eh_frame_hdr .eh_frame
05 .init_array .fini_array .dynamic .got .got.plt .data .bss
06 .dynamic
07 .note.ABI-tag
08 .eh_frame_hdr
09
10 .init_array .fini_array .dynamic .got
最后,其中一些选择受 --(no)rosegment
(或 BFD ld<的
-Wl,z,noseparate-code
影响/code>) 链接器选项.
Finally, some of these choices are affected by the --(no)rosegment
(or -Wl,z,noseparate-code
for BFD ld
) linker option.
这篇关于为什么 ELF 可执行文件可以有 4 个 LOAD 段?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!