ELF program header virtual address and file offset


Question

I know the relationship between the two:

virtual address mod page alignment == file offset mod page alignment

But can someone tell me in which direction are these two numbers computed?

Is virtual address computed from file offset according to the relationship above, or vice versa?

Update

Here is some more detail: when the linker writes the ELF file header, it sets the virtual address and file offset of the program headers (segments).

For example, here is the output of readelf -l someELFfile:

Elf file type is EXEC (Executable file)
Entry point 0x8048094
Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000000 0x08048000 0x08048000 0x00154 0x00154 R E 0x1000
  LOAD           0x000154 0x08049154 0x08049154 0x00004 0x00004 RW  0x1000
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x10

We can see 2 LOAD segments.

The virtual address of the first LOAD ends at 0x8048154, while the second LOAD starts at 0x8049154.

In the ELF file, the second LOAD is right behind the first LOAD, at file offset 0x00154. However, when this ELF is loaded into memory, the second LOAD starts 0x1000 bytes after the end of the first LOAD segment.

But, why? If we have to consider memory page alignment, why doesn't the second LOAD segment start at 0x08049000? Why does it start at 0x1000 bytes AFTER THE END of the first LOAD segment?
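The observation above can be checked with a little arithmetic. This is only a sketch using the numbers from the readelf output; the variable names are made up for illustration.

```python
# (offset, vaddr, filesz) copied from the two LOAD lines of the readelf output.
first = (0x000000, 0x08048000, 0x154)
second = (0x000154, 0x08049154, 0x004)

first_file_end = first[0] + first[2]   # where the first segment ends in the file
first_mem_end = first[1] + first[2]    # where it ends in virtual memory

file_gap = second[0] - first_file_end  # bytes between the segments in the file
mem_gap = second[1] - first_mem_end    # bytes between the segments in memory

print(hex(first_mem_end))
print(hex(file_gap), hex(mem_gap))
```

The segments are adjacent in the file but one full page apart in memory.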

I know the virtual address of the second LOAD satisfies the relationship:

virtual address mod page alignment == file offset mod page alignment

But I don't know why this relationship must be satisfied.

Answer

Why does it start at 0x1000 bytes AFTER THE END of the first LOAD segment?

If it didn't, it would have to start at 0x08048154, but it can't: the two LOAD segments have different flags specified for their mappings (the first is mapped with PROT_READ|PROT_EXEC, the second with PROT_READ|PROT_WRITE). Protections (being part of the page table) can only apply to whole pages, not parts of a page. Therefore, mappings with different protections must belong to different pages.
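To make the page argument concrete, here is a small sketch that lists which pages each segment touches. The `pages` helper is hypothetical, and the page size is assumed to equal the 0x1000 value in the Align column.

```python
PAGE = 0x1000  # assumed page size, taken from the Align column

def pages(vaddr, memsz, page=PAGE):
    """Return the set of page base addresses touched by [vaddr, vaddr + memsz)."""
    first = vaddr & ~(page - 1)
    last = (vaddr + memsz - 1) & ~(page - 1)
    return set(range(first, last + page, page))

text_pages = pages(0x08048000, 0x154)  # first LOAD (R E)

# Hypothetical layout: second LOAD placed directly after the first.
packed = pages(0x08048154, 0x4)
# Actual layout from the readelf output.
actual = pages(0x08049154, 0x4)

print([hex(p) for p in text_pages & packed])  # shared page: protections would clash
print([hex(p) for p in text_pages & actual])  # no shared page
```

In the packed layout both segments land in the page at 0x08048000, which cannot be both R E and RW; in the actual layout they do not share a page.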

virtual address mod page alignment == file offset mod page alignment
But I don't know why this relationship must be satisfied.

The LOAD segments are mmapped directly from the file. The actual mapping of the second LOAD segment performed for your example will look something like this (you can run your program under strace and see that it does):

mmap(0x08049000, 0x158, PROT_READ|PROT_WRITE, MAP_PRIVATE, $fd, 0)

If you try to make the virtual address or the offset non-page-aligned, mmap will fail with EINVAL. The only way to make file data appear in virtual memory at the desired address is to make VirtAddr congruent to Offset modulo Align, and that is exactly what the static linker does.
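The rounding behind that mmap call can be sketched as follows. Both helpers are hypothetical names for illustration: `mmap_args` shows how a loader could derive page-aligned arguments from a program header, and `place_after` shows one way (not necessarily what any particular linker does) to pick the lowest address on a fresh page that keeps VirtAddr congruent to Offset.

```python
def mmap_args(offset, vaddr, filesz, align):
    """Round a LOAD segment down to page boundaries: (addr, length, file_offset)."""
    if vaddr % align != offset % align:
        raise ValueError("VirtAddr and Offset are not congruent modulo Align")
    slack = vaddr % align  # bytes between the page start and the segment start
    return vaddr - slack, filesz + slack, offset - slack

def place_after(prev_end, offset, align):
    """Lowest address on a page at or after prev_end that is congruent to offset."""
    next_page = (prev_end + align - 1) & ~(align - 1)
    return next_page + offset % align

# Second LOAD segment from the readelf output.
addr, length, file_off = mmap_args(0x154, 0x08049154, 0x4, 0x1000)
print(hex(addr), hex(length), hex(file_off))

# Given the file offset 0x154 and the end of the first segment, derive VirtAddr.
print(hex(place_after(0x08048154, 0x154, 0x1000)))
```

With the values from the example this reproduces the address 0x08049000, length 0x158 and file offset 0 of the mmap call above, and the VirtAddr 0x08049154 shown by readelf.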

Note that for such a small first LOAD segment, the entire first segment also appears at the beginning of the second mapping (with the wrong protections). But the program is not supposed to access anything in the [0x08049000,0x08049154) range. In general, it is almost always the case that there is some "junk" before the start of actual data in the second LOAD segment (unless you get really lucky and the first LOAD segment ends on a page boundary).
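The "junk" can be seen with an ordinary file mapping. This sketch does not load a real ELF file: it writes a stand-in file with 0x154 filler bytes followed by 4 data bytes (both contents are made up) and maps it from file offset 0, the same way the mmap call above does.

```python
import mmap
import tempfile

FIRST_SIZE = 0x154  # FileSiz of the first LOAD segment
DATA = b"DATA"      # stand-in for the 4 bytes of the second LOAD segment

def map_second_segment():
    """Map the stand-in file from offset 0 and return (junk, data) as bytes."""
    with tempfile.TemporaryFile() as f:
        f.write(b"T" * FIRST_SIZE + DATA)
        f.flush()
        # Offset 0 is page-aligned; length 0x158 covers the junk plus the data.
        with mmap.mmap(f.fileno(), FIRST_SIZE + len(DATA),
                       access=mmap.ACCESS_READ) as m:
            return m[:FIRST_SIZE], m[FIRST_SIZE:]

junk, data = map_second_segment()
print(len(junk), data)
```

The first 0x154 bytes of the mapping are a second copy of the first segment's file contents; the segment's own data only begins after them.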

See also the mmap man page.
