ELF program header virtual address and file offset
Question
I know the relationship between the two:
virtual address mod page alignment == file offset mod page alignment
But can someone tell me in which direction these two numbers are computed?
Is the virtual address computed from the file offset according to the relationship above, or vice versa?
Here is some more detail: when the linker writes the ELF file header, it sets the virtual address and file offset in the program headers (segments).
For example, here is the output of readelf -l someELFfile:
Elf file type is EXEC (Executable file)
Entry point 0x8048094
Program Headers:
Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
LOAD           0x000000 0x08048000 0x08048000 0x00154 0x00154 R E 0x1000
LOAD           0x000154 0x08049154 0x08049154 0x00004 0x00004 RW  0x1000
GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x10
We can see 2 LOAD segments.
The virtual address range of the first LOAD ends at 0x8048154, while the second LOAD starts at 0x8049154.
In the ELF file, the second LOAD comes right after the first LOAD, at file offset 0x00154. However, when this ELF is loaded into memory, the second LOAD starts 0x1000 bytes after the end of the first LOAD segment.
But why? If we have to consider memory page alignment, why doesn't the second LOAD segment start at 0x08049000? Why does it start at 0x1000 bytes AFTER THE END of the first LOAD segment?
I know the virtual address of the second LOAD satisfies the relationship:
virtual address mod page alignment == file offset mod page alignment
But I don't know why this relationship must be satisfied.
Recommended answer
"Why does it start at 0x1000 bytes AFTER THE END of the first LOAD segment?"
If it didn't, it would have to start at 0x08048154, but it can't: the two LOAD segments have different flags specified for their mappings (the first is mapped with PROT_READ|PROT_EXEC, the second with PROT_READ|PROT_WRITE). Protections (being part of the page table) can only apply to whole pages, not parts of a page. Therefore, mappings with different protections must belong to different pages.
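The page-granularity argument can be checked with a little arithmetic. The following is a minimal sketch, assuming a 0x1000-byte page (the Align value from the readelf output above); the helper name pages and the "packed" placement at 0x08048154 are hypothetical and used only for illustration.

```python
PAGE = 0x1000  # assumed page size, taken from the Align column


def pages(vaddr, memsz, page=PAGE):
    """Return the set of page base addresses touched by [vaddr, vaddr + memsz).

    Assumes memsz > 0.
    """
    first = vaddr & ~(page - 1)
    last = (vaddr + memsz - 1) & ~(page - 1)
    return set(range(first, last + page, page))


# First LOAD segment (R E) and second LOAD segment (RW) as laid out by the linker.
text_pages = pages(0x08048000, 0x154)
data_pages = pages(0x08049154, 0x4)

# Hypothetical layout: second segment placed right after the first in memory.
packed_pages = pages(0x08048154, 0x4)

# Pages that would need two different protections at once.
conflict_actual = text_pages & data_pages
conflict_packed = text_pages & packed_pages
```

With the linker's layout the two segments share no page, while the hypothetical packed layout would put both on the page at 0x08048000, which cannot be both read-execute and read-write under this scheme.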
"virtual address mod page alignment == file offset mod page alignment
But I don't know why this relationship must be satisfied."
The LOAD segments are directly mmaped from the file. The actual mapping of the second LOAD segment performed for your example will look something like this (you can run your program under strace and see that it does):
mmap(0x08049000, 0x158, PROT_READ|PROT_WRITE, MAP_PRIVATE, $fd, 0)
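The arguments of that call follow from the program header fields. Here is a minimal sketch of the arithmetic, assuming the loader simply rounds both the address and the file offset down to the page boundary; the function name mmap_args is hypothetical, and this is a simplified model rather than the kernel's actual code.

```python
def mmap_args(offset, vaddr, filesz, align=0x1000):
    """Derive (address, length, file offset) for mapping one LOAD segment.

    Requires vaddr and offset to be congruent modulo align, so that
    rounding both down by the same amount keeps them in step.
    """
    if vaddr % align != offset % align:
        raise ValueError("VirtAddr and Offset are not congruent modulo Align")
    skew = vaddr % align  # bytes between the page start and the segment's first byte
    return vaddr - skew, filesz + skew, offset - skew


# Second LOAD segment from the readelf output above.
addr, length, file_off = mmap_args(offset=0x154, vaddr=0x08049154, filesz=0x4)
```

For the second LOAD segment this gives address 0x08049000, length 0x158 and file offset 0, matching the call shown above.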
If you try to make the virtual address or the offset non-page-aligned, mmap will fail with EINVAL. The only way to make the file data appear in virtual memory at the desired address is to make VirtAddr congruent to Offset modulo Align, and that is exactly what the static linker does.
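The offset half of that constraint is easy to observe from user space. The sketch below uses Python's standard mmap module on a scratch temporary file (not an ELF file); the helper names try_map and demo are hypothetical, and it assumes a Unix-like system where the underlying mmap call rejects an unaligned file offset.

```python
import mmap
import tempfile


def try_map(fd, length, offset):
    """Return True if `length` bytes of the file can be mapped at `offset`."""
    try:
        m = mmap.mmap(fd, length, access=mmap.ACCESS_READ, offset=offset)
    except (OSError, ValueError):
        # On Linux an unaligned offset surfaces as OSError with errno EINVAL.
        return False
    m.close()
    return True


def demo():
    page = mmap.ALLOCATIONGRANULARITY
    with tempfile.TemporaryFile() as f:
        f.write(b"\0" * (2 * page))  # two pages of scratch data
        f.flush()
        aligned_ok = try_map(f.fileno(), 4, page)     # page-aligned offset
        unaligned_ok = try_map(f.fileno(), 4, 0x154)  # offset of the second LOAD
    return aligned_ok, unaligned_ok
```

Calling demo() is expected to report that the aligned mapping succeeds and the one at offset 0x154 does not, which is why the loader maps from the rounded-down offset instead.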
Note that for such a small first LOAD segment, the entire first segment also appears at the beginning of the second mapping (with the wrong protections). But the program is not supposed to access anything in the [0x08049000,0x08049154) range. In general, there is almost always some "junk" before the start of the actual data in the second LOAD segment (unless you get really lucky and the first LOAD segment ends on a page boundary).
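The size and origin of that "junk" can be computed directly. This is a minimal sketch, again assuming a 0x1000-byte page; junk_before is a hypothetical helper that returns the virtual address range preceding the segment's real data within its first page, together with the file range those bytes come from.

```python
def junk_before(offset, vaddr, align=0x1000):
    """Return ((va_start, va_end), (file_start, file_end)) for the bytes that
    are mapped ahead of the segment's actual data. Both ranges are half-open."""
    skew = vaddr % align
    return (vaddr - skew, vaddr), (offset - skew, offset)


# Second LOAD segment from the example.
va_range, file_range = junk_before(offset=0x154, vaddr=0x08049154)

# A segment whose data starts exactly on a page boundary has no junk.
lucky_va, lucky_file = junk_before(offset=0x1000, vaddr=0x08049000)
```

For the example, the junk occupies [0x08049000, 0x08049154) and comes from file bytes [0, 0x154), which is exactly the first LOAD segment. In the page-aligned case both ranges are empty.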
See also the mmap man page.