Relationship between VMA and ELF segments


Question

I need to determine the VMAs for the loadable segments of ELF executables. VMAs can be printed from /proc/pid/maps, and the relationship between the VMAs shown by maps and the loadable segments is mostly clear to me: each segment consists of one or more VMAs. What method does the kernel use to form VMAs from ELF segments: does it take into consideration only the permissions/flags, or is something else required as well? As per my understanding, a segment with flags Read, Execute (code) will go in a separate VMA having the same permissions, while the next segment with permissions Read, Write (data) should go in another VMA. But this is not the case with the second loadable segment: it is usually split into two or more VMAs, some with read and write permissions and others read-only. So my assumption that flags are the only factor in VMA generation seems wrong. I need help understanding this relationship between segments and VMAs.

What I want to do is to programmatically determine the VMAs for the loadable segments of an ELF file without loading it in memory. So any pointer/help in this direction is the main objective of this post.

Solution

A VMA is a homogeneous region of virtual memory with:

  • the same permissions (PROT_EXEC, etc.);

  • the same type (MAP_SHARED/MAP_PRIVATE);

  • the same backing file (if any);

  • a consistent offset within the file.

For example, if you have a VMA which is RW and you mprotect a part in the middle of it with PROT_READ (that is, you remove the permission to write), the kernel will split the VMA into three VMAs (the first one being RW, the second R and the last RW).
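The splitting rule can be sketched in a few lines. This is only an illustrative model, not kernel code: the Range struct, the kRead/kWrite constants and split_on_mprotect are hypothetical names, and the model looks at a single VMA and ignores everything except the address range and the permissions.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified VMA: only the range and the permissions.
struct Range {
  std::uint64_t start, end;  // [start, end), assumed page-aligned
  int prot;                  // bitmask of the constants below
};

constexpr int kRead = 1, kWrite = 2;  // stand-ins for PROT_READ / PROT_WRITE

// Re-protect [addr, addr + len) inside one VMA and return the resulting
// pieces in address order (one, two or three of them).
std::vector<Range> split_on_mprotect(const Range& vma, std::uint64_t addr,
                                     std::uint64_t len, int prot) {
  assert(vma.start < vma.end);
  // Clamp the requested range to the VMA.
  std::uint64_t lo = addr < vma.start ? vma.start : addr;
  std::uint64_t hi = addr + len > vma.end ? vma.end : addr + len;
  if (lo >= hi || prot == vma.prot) return {vma};  // nothing changes
  std::vector<Range> out;
  if (vma.start < lo) out.push_back({vma.start, lo, vma.prot});  // untouched head
  out.push_back({lo, hi, prot});                                 // re-protected part
  if (hi < vma.end) out.push_back({hi, vma.end, vma.prot});      // untouched tail
  return out;
}
```

Re-protecting the middle page of a three-page RW range yields the RW, R, RW pieces described above; re-protecting the first page yields only two pieces, which is what happens with RELRO further down.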

Let's look at the typical VMAs of an executable:

$ cat /proc/$$/maps
00400000-004f2000 r-xp 00000000 08:01 524453     /bin/bash
006f1000-006f2000 r--p 000f1000 08:01 524453     /bin/bash
006f2000-006fb000 rw-p 000f2000 08:01 524453     /bin/bash
006fb000-00702000 rw-p 00000000 00:00 0
[...]

The first VMA is the text segment. The second, third and fourth VMAs are the data segment.

Anonymous mapping for .bss

At the beginning of the process, you will have something like this:

$ cat /proc/$$/maps
00400000-004f2000 r-xp 00000000 08:01 524453     /bin/bash
006f1000-006fb000 rw-p 000f1000 08:01 524453     /bin/bash
006fb000-00702000 rw-p 00000000 00:00 0
[...]

  • 006f1000-006fb000 is the part of the data segment which comes from the executable file.

  • 006fb000-00702000 is not present in the executable file because it is initially filled with zeroes. The non-initialized variables of the process are all grouped together (in the .bss section) and are not represented in the executable file in order to save space (1).

This comes from the PT_LOAD entries of the program header table of the executable file (readelf -l), which describe the segments to map into memory:

Type    Offset             VirtAddr           PhysAddr
        FileSiz            MemSiz              Flags  Align
[...]
LOAD    0x0000000000000000 0x0000000000400000 0x0000000000400000
        0x00000000000f1a74 0x00000000000f1a74  R E    200000
LOAD    0x00000000000f1de0 0x00000000006f1de0 0x00000000006f1de0
        0x0000000000009068 0x000000000000f298  RW     200000
[...]

If you look at the second PT_LOAD entry, you will notice that a part of the segment is not represented in the file (because the file size is smaller than the memory size).

The part of the data segment which is not in the executable file is initialized with zeros: the dynamic linker uses a MAP_ANONYMOUS mapping for this part of the data segment. This is why it appears as a separate VMA (it does not have the same backing file).
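As a rough sketch of that split, the file-backed and anonymous parts of one PT_LOAD entry can be computed from VirtAddr, FileSiz and MemSiz. The names LoadSplit and split_load are hypothetical, 4 KiB pages are assumed, and the rounding simply follows the page-alignment rule of note (3); a real loader handles more corner cases.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kPageSize = 0x1000;  // assumption: 4 KiB pages

constexpr std::uint64_t page_floor(std::uint64_t a) { return a & ~(kPageSize - 1); }
constexpr std::uint64_t page_ceil(std::uint64_t a) { return page_floor(a + kPageSize - 1); }

// Page-aligned ranges produced by one PT_LOAD entry (hypothetical struct).
struct LoadSplit {
  std::uint64_t file_start, file_end;  // mapped from the executable file
  std::uint64_t anon_start, anon_end;  // zero-filled, empty if start == end
};

LoadSplit split_load(std::uint64_t vaddr, std::uint64_t filesz,
                     std::uint64_t memsz) {
  assert(filesz <= memsz);  // MemSiz is never smaller than FileSiz
  LoadSplit s;
  s.file_start = page_floor(vaddr);
  s.file_end = page_ceil(vaddr + filesz);
  s.anon_start = s.file_end;  // the anonymous part starts on the next page
  s.anon_end = page_ceil(vaddr + memsz);
  return s;
}
```

With the values of the second LOAD entry above (0x6f1de0, 0x9068, 0xf298), this gives 0x6f1000-0x6fb000 for the file-backed part and 0x6fb000-0x702000 for the anonymous part, matching the maps output.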

Relocation protection (PT_GNU_RELRO)

When the dynamic linker has finished doing the relocations (2), it might mark some part of the data segment (the .got section among others) as read-only in order to avoid GOT-poisoning attacks or bugs. The part of the data segment which should be protected after the relocations is described by the PT_GNU_RELRO entry of the program header table: the dynamic linker calls mprotect(addr, len, PROT_READ) on the given region after finishing the relocations (3). This mprotect call splits the second VMA into two VMAs (the first one R and the second one RW).

Type        Offset             VirtAddr           PhysAddr
            FileSiz            MemSiz             Flags  Align
[...]
GNU_RELRO   0x00000000000f1de0 0x00000000006f1de0 0x00000000006f1de0
            0x0000000000000220 0x0000000000000220  R
[...]

Summary

The VMAs

00400000-004f2000 r-xp 00000000 08:01 524453     /bin/bash
006f1000-006f2000 r--p 000f1000 08:01 524453     /bin/bash
006f2000-006fb000 rw-p 000f2000 08:01 524453     /bin/bash
006fb000-00702000 rw-p 00000000 00:00 0

are derived from the VirtAddr, MemSiz and Flags fields of the PT_LOAD and PT_GNU_RELRO entries:

Type       Offset             VirtAddr           PhysAddr
           FileSiz            MemSiz              Flags  Align
[...]
LOAD       0x0000000000000000 0x0000000000400000 0x0000000000400000
           0x00000000000f1a74 0x00000000000f1a74  R E    200000
LOAD       0x00000000000f1de0 0x00000000006f1de0 0x00000000006f1de0
           0x0000000000009068 0x000000000000f298  RW     200000
[...]
GNU_RELRO 0x00000000000f1de0 0x00000000006f1de0 0x00000000006f1de0
          0x0000000000000220 0x0000000000000220  R
[...]

  1. First, all PT_LOAD entries are processed. Each of them triggers the creation of one VMA by using mmap(). In addition, if MemSiz > FileSiz, it might create an additional anonymous VMA.

  2. Then all PT_GNU_RELRO entries (there is only one in practice) are processed. Each of them triggers an mprotect() call which might split an existing VMA into different VMAs.

In order to do what you want, the correct way is probably to simulate the mmap and mprotect calls:

// Virtual Memory Area:
struct Vma {
  std::uint64_t addr, length;
  std::string file_name;
  int prot;
  int flags;
  std::uint64_t offset;
};

// Virtual Address Space:
class Vas {
private:
  std::list<Vma> vmas_;
public:
  Vma& mmap(
    std::uint64_t addr, std::uint64_t length, int prot,
    int flags, int fd, off_t offset);
  int mprotect(std::uint64_t addr, std::uint64_t len, int prot);
  std::list<Vma> const& vmas() const { return vmas_; }
};

for (Elf32_Phdr const& h : phdrs)
  if (h.p_type == PT_LOAD) {
    vas.mmap(...);
    if (anon_size)
      vas.mmap(...); 
  }  
for (Elf32_Phdr const& h : phdrs)
  if (h.p_type == PT_GNU_RELRO)
    vas.mprotect(...);  
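To make the idea concrete, here is a minimal self-contained sketch of such a simulation. It is not the code above filled in: Phdr and Vma are simplified hypothetical structs (not the ones from <elf.h>), permissions are plain strings, 4 KiB pages are assumed, adjacent VMAs are never merged, and the end of the RELRO region is rounded up as in the computations of this answer, which a real dynamic linker may do differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

constexpr std::uint64_t kPage = 0x1000;  // assumption: 4 KiB pages
constexpr std::uint64_t page_floor(std::uint64_t a) { return a & ~(kPage - 1); }
constexpr std::uint64_t page_ceil(std::uint64_t a) { return page_floor(a + kPage - 1); }

// Subset of a program header entry (hypothetical struct, not <elf.h>).
struct Phdr {
  enum Type { Load, GnuRelro } type;
  std::uint64_t offset, vaddr, filesz, memsz;
  std::string perms;  // "r-x", "rw-", ...
};

struct Vma {
  std::uint64_t start, end;
  std::string perms;
  bool file_backed;
  std::uint64_t offset;  // file offset, 0 for anonymous mappings
};

// Model of mprotect(): re-protect [lo, hi), splitting VMAs that straddle it.
void protect(std::vector<Vma>& vmas, std::uint64_t lo, std::uint64_t hi,
             const std::string& perms) {
  std::vector<Vma> out;
  for (const Vma& v : vmas) {
    std::uint64_t a = std::max(lo, v.start), b = std::min(hi, v.end);
    if (a >= b) { out.push_back(v); continue; }  // no overlap: keep as is
    auto piece = [&](std::uint64_t s, std::uint64_t e, const std::string& p) {
      out.push_back({s, e, p, v.file_backed,
                     v.file_backed ? v.offset + (s - v.start) : 0});
    };
    if (v.start < a) piece(v.start, a, v.perms);
    piece(a, b, perms);
    if (b < v.end) piece(b, v.end, v.perms);
  }
  vmas = out;
}

std::vector<Vma> simulate(const std::vector<Phdr>& phdrs) {
  std::vector<Vma> vmas;
  // Pass 1: each PT_LOAD gives a file-backed VMA plus, if MemSiz extends
  // past the last file-backed page, an anonymous one.
  for (const Phdr& h : phdrs) {
    if (h.type != Phdr::Load) continue;
    assert(h.filesz <= h.memsz);
    std::uint64_t start = page_floor(h.vaddr);
    std::uint64_t file_end = page_ceil(h.vaddr + h.filesz);
    std::uint64_t mem_end = page_ceil(h.vaddr + h.memsz);
    if (file_end > start)
      vmas.push_back({start, file_end, h.perms, true, page_floor(h.offset)});
    if (mem_end > file_end)
      vmas.push_back({file_end, mem_end, h.perms, false, 0});
  }
  // Pass 2: each PT_GNU_RELRO makes its pages read-only.
  for (const Phdr& h : phdrs)
    if (h.type == Phdr::GnuRelro)
      protect(vmas, page_floor(h.vaddr), page_ceil(h.vaddr + h.memsz), "r--");
  return vmas;
}
```

Fed with the two LOAD entries and the GNU_RELRO entry shown in the summary, simulate() returns four VMAs with the same ranges, permissions and file offsets as the /proc/$$/maps listing.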

Some examples of computations

The addresses are slightly different because the VMAs are page-aligned (3) (using 4 KiB = 0x1000-byte pages on x86 and x86_64):

The first VMA is described by the first PT_LOAD entry:

vma[0].start = page_floor(load[0].virt_addr)
             = 0x400000

vma[0].end = page_ceil(load[0].virt_addr + load[0].mem_size)
           = page_ceil(0x400000 + 0xf1a74)
           = page_ceil(0x4f1a74)
           = 0x4f2000

The next VMA is the part of the data segment which has been protected and is described by PT_GNU_RELRO:

vma[1].start = page_floor(relro[0].virt_addr)
             = page_floor(0x6f1de0)
             = 0x6f1000

vma[1].end = page_ceil(relro[0].virt_addr + relro[0].mem_size)
           = page_ceil(0x6f1de0 + 0x220)
           = page_ceil(0x6f2000)
           = 0x6f2000

[...]

Correspondence with the sections

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .interp           PROGBITS         0000000000400238  00000238
       000000000000001c  0000000000000000   A       0     0     1
  [ 2] .note.ABI-tag     NOTE             0000000000400254  00000254
       0000000000000020  0000000000000000   A       0     0     4
  [ 3] .note.gnu.build-i NOTE             0000000000400274  00000274
       0000000000000024  0000000000000000   A       0     0     4
  [ 4] .gnu.hash         GNU_HASH         0000000000400298  00000298
       0000000000004894  0000000000000000   A       5     0     8
  [ 5] .dynsym           DYNSYM           0000000000404b30  00004b30
       000000000000d6c8  0000000000000018   A       6     1     8
  [ 6] .dynstr           STRTAB           00000000004121f8  000121f8
       0000000000008c25  0000000000000000   A       0     0     1
  [ 7] .gnu.version      VERSYM           000000000041ae1e  0001ae1e
       00000000000011e6  0000000000000002   A       5     0     2
  [ 8] .gnu.version_r    VERNEED          000000000041c008  0001c008
       00000000000000b0  0000000000000000   A       6     2     8
  [ 9] .rela.dyn         RELA             000000000041c0b8  0001c0b8
       00000000000000c0  0000000000000018   A       5     0     8
  [10] .rela.plt         RELA             000000000041c178  0001c178
       00000000000013f8  0000000000000018  AI       5    12     8
  [11] .init             PROGBITS         000000000041d570  0001d570
       000000000000001a  0000000000000000  AX       0     0     4
  [12] .plt              PROGBITS         000000000041d590  0001d590
       0000000000000d60  0000000000000010  AX       0     0     16
  [13] .text             PROGBITS         000000000041e2f0  0001e2f0
       0000000000099c42  0000000000000000  AX       0     0     16
  [14] .fini             PROGBITS         00000000004b7f34  000b7f34
       0000000000000009  0000000000000000  AX       0     0     4
  [15] .rodata           PROGBITS         00000000004b7f40  000b7f40
       000000000001ebb0  0000000000000000   A       0     0     64
  [16] .eh_frame_hdr     PROGBITS         00000000004d6af0  000d6af0
       000000000000407c  0000000000000000   A       0     0     4
  [17] .eh_frame         PROGBITS         00000000004dab70  000dab70
       0000000000016f04  0000000000000000   A       0     0     8
  [18] .init_array       INIT_ARRAY       00000000006f1de0  000f1de0
       0000000000000008  0000000000000000  WA       0     0     8
  [19] .fini_array       FINI_ARRAY       00000000006f1de8  000f1de8
       0000000000000008  0000000000000000  WA       0     0     8
  [20] .jcr              PROGBITS         00000000006f1df0  000f1df0
       0000000000000008  0000000000000000  WA       0     0     8
  [21] .dynamic          DYNAMIC          00000000006f1df8  000f1df8
       0000000000000200  0000000000000010  WA       6     0     8
  [22] .got              PROGBITS         00000000006f1ff8  000f1ff8
       0000000000000008  0000000000000008  WA       0     0     8
  [23] .got.plt          PROGBITS         00000000006f2000  000f2000
       00000000000006c0  0000000000000008  WA       0     0     8
  [24] .data             PROGBITS         00000000006f26c0  000f26c0
       0000000000008788  0000000000000000  WA       0     0     64
  [25] .bss              NOBITS           00000000006fae80  000fae48
       00000000000061f8  0000000000000000  WA       0     0     64
  [26] .shstrtab         STRTAB           0000000000000000  000fae48
       00000000000000ef  0000000000000000           0     0     1

If you compare the Address of the sections (readelf -S) with the ranges of the VMAs, you find the mappings:

00400000-004f2000 r-xp /bin/bash : .interp, .note.ABI-tag, .note.gnu.build-id, .gnu.hash, .dynsym, .dynstr, .gnu.version, .gnu.version_r, .rela.dyn, .rela.plt, .init, .plt, .text, .fini, .rodata, .eh_frame_hdr, .eh_frame
006f1000-006f2000 r--p /bin/bash : .init_array, .fini_array, .jcr, .dynamic, .got
006f2000-006fb000 rw-p /bin/bash : .got.plt, .data, beginning of .bss
006fb000-00702000 rw-p -         : rest of .bss
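This comparison is a plain interval-overlap test, which could be sketched as follows. The Vma and Section structs and the vmas_of function are hypothetical names for illustration; the inputs would be the Address and Size columns of readelf -S and the ranges from maps.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct Vma { std::uint64_t start, end; std::string perms; };          // [start, end)
struct Section { std::string name; std::uint64_t addr, size; };

// Return the indices of the VMAs that overlap the section's address range.
// A section such as .bss can span more than one VMA.
std::vector<std::size_t> vmas_of(const Section& s, const std::vector<Vma>& vmas) {
  std::vector<std::size_t> hits;
  if (s.size == 0) return hits;  // empty sections occupy no memory
  for (std::size_t i = 0; i < vmas.size(); ++i) {
    assert(vmas[i].start <= vmas[i].end);
    if (s.addr < vmas[i].end && vmas[i].start < s.addr + s.size)
      hits.push_back(i);
  }
  return hits;
}
```

For instance, with the four VMAs above, .got (0x6f1ff8, size 8) falls in the second VMA only, while .bss (0x6fae80, size 0x61f8) overlaps both the third and the fourth, which is the "beginning of .bss" / "rest of .bss" split in the listing.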

Notes

(1): In fact, it's more complicated: a part of the .bss section might be represented in the executable file for page alignment reasons.

(2): In fact, when it has finished doing the non-lazy relocations.

(3): MMU operations work at page granularity, so the memory ranges of mmap(), mprotect() and munmap() calls are extended to cover full pages.
