Why didn't I get a segmentation fault when storing past the end of the BSS?


Question

I'm experimenting with assembly language and wrote a program that prints 2 hardcoded bytes to stdout. Here it is:

section .text
     global _start

_start:
     mov eax, 0x0A31
     mov [val], eax
     mov eax, 4
     mov ebx, 1
     mov ecx, val
     mov edx, 2

     int 0x80

     mov eax, 1
     int 0x80

 segment .bss
     val resb 1;   <------ Here

Note that I reserved only 1 byte in the bss segment but actually stored 2 bytes (the character code for 1, followed by a newline) at that memory location. The program worked fine: it printed the character 1 and then a newline.

But I expected a segmentation fault. Why didn't it occur? We reserved only 1 byte but stored 2.

Recommended answer

x86, like most other modern architectures, uses paging / virtual memory for memory protection. On x86 (again like many other architectures), the granularity is 4 KiB.

A 4-byte store to val won't fault unless the linker happens to place it in the last 3 bytes of a page and the next page is unmapped.
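The "last 3 bytes of a page" condition can be checked with a little address arithmetic. The following is only an illustrative sketch: the helper names pages_touched and crosses_page are made up for this example, and the 4096-byte page size is hard-coded as an assumption rather than queried from the OS.

```python
PAGE_SIZE = 4096  # assumed x86 base page size (hard-coded, not queried)


def pages_touched(addr, size, page_size=PAGE_SIZE):
    """Return the range of page numbers covered by a store of `size` bytes at `addr`."""
    first = addr // page_size
    last = (addr + size - 1) // page_size
    return range(first, last + 1)


def crosses_page(addr, size, page_size=PAGE_SIZE):
    """True if the store spills over into the following page."""
    return len(pages_touched(addr, size, page_size)) > 1


# A 4-byte store stays inside one page unless it starts in the page's last 3 bytes.
for offset in (0, 0xA8, 4092, 4093, 4094, 4095):
    print(f"page offset {offset:4d}: 4-byte store crosses = {crosses_page(offset, 4)}")
```

Crossing a page boundary is only half of the condition: the store faults only if the page it spills into is also unmapped (or not writable).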

What actually happens is that you just overwrite whatever comes after val. In this case, that's just unused space up to the end of the page. If you had other static storage locations in the BSS, you'd step on their values. (Call them "variables" if you want, but the high-level concept of a "variable" doesn't just mean a memory location: a variable can be live in a register and never need an address.)

Besides the Wikipedia article linked above, see also:

  • How does x86 paging work? (internals of the page-table format, and how the OS manages it and the CPU reads it).
  • What is the state of the art in Memory Protection?
  • Is it safe to read past the end of a buffer within the same page on x86 and x64?
  • About the memory layout of programs in Linux

"but actually put 2 bytes (charcode for 1 and newline symbol) into the memory location."

mov [val], eax is a 4-byte store. The operand size is determined by the register. If you want a 2-byte store, use mov [val], ax.
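To see which bytes each store width would put in memory, here is a small sketch that uses Python's struct module to mimic x86's little-endian layout. It is only an illustration of the byte order, not a model of the CPU; the variable names are made up.

```python
import struct

value = 0x0A31  # the constant loaded into eax in the question

# What a 4-byte store (mov [val], eax) writes, lowest address first.
dword_store = struct.pack("<I", value)
# What a 2-byte store (mov [val], ax) would write.
word_store = struct.pack("<H", value)

print(dword_store)  # b'1\n\x00\x00'
print(word_store)   # b'1\n'
```

The write system call in the question passes edx = 2, so only the first two bytes ('1' and the newline) are printed; the two extra zero bytes from the 4-byte store land in memory past the word but never show up in the output.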

Fun fact: MASM would warn or error about an operand-size mismatch, because it magically associates a size with each symbol name based on the directive that reserves space after it. NASM stays out of your way, so mov [val], eax assembles without complaint. If you wrote mov [val], 0x0A31, though, it would be an error: neither operand implies a size, so you need mov dword [val], 0x0A31 (or word or byte).

For some reason, the BSS doesn't start at the beginning of a page in a 32-bit binary, but it is near the start of one. You're not linking with anything else that would use up most of a page in the BSS. nm bss-no-segfault shows that it's at 0x080490a8, and a 4k page is 0x1000 bytes, so the last byte in the BSS mapping is 0x08049fff.
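The page arithmetic in that paragraph can be reproduced in a few lines. This sketch takes the address quoted from nm and assumes it is where val lives in the original program; the variable names are illustrative.

```python
PAGE_SIZE = 0x1000        # 4k page, as stated in the text
val_addr = 0x080490A8     # address quoted from nm (assumed to be val's address)

# Round down to the start of the containing page, then find its last byte.
page_start = val_addr & ~(PAGE_SIZE - 1)
page_last = page_start + PAGE_SIZE - 1

# Bytes from val through the end of the page, all of them mapped and writable.
room = page_last - val_addr + 1

print(hex(page_start), hex(page_last), room)
```

With thousands of mapped bytes after val, a store that overruns the 1-byte reservation by 3 bytes stays well inside the page.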

It seems that the BSS start address changes when I add an instruction to the .text section, so presumably the linker's choices here are related to packing things into an ELF executable. That doesn't make much sense, because the BSS isn't stored in the file; it's just a base address + length. I'm not going down that rabbit hole. I'm sure there's a reason that making .text slightly larger results in a BSS that starts at the beginning of a page, but I don't know what it is.

Anyway, if we construct the BSS so that val is right before the end of a page, we can get a fault:

... same .text

section .bss
dummy:  resb 4096 - 0xa8 - 2
val:    resb 1

;; could have done this instead of making up constants
;; ALIGN 4096
;; dummy2: resb 4094
;; val2:   resb 1

Then build and run:

$ asm-link -m32 bss-no-segfault.asm
+ yasm -felf32 -Worphan-labels -gdwarf2 bss-no-segfault.asm
+ ld -melf_i386 -o bss-no-segfault bss-no-segfault.o

peter@volta:~/src/SO$ nm bss-no-segfault
080490a7 B __bss_start
080490a8 b dummy
080490a7 B _edata
0804a000 B _end         <---------  End of the BSS
08048080 T _start
08049ffe b val          <---------  Address of val

 gdb ./bss-no-segfault

 (gdb) b _start
 (gdb) r
 (gdb) set disassembly-flavor intel
 (gdb) layout reg

 (gdb) p &val
 $2 = (<data variable, no debug info> *) 0x8049ffe
 (gdb) si    # and press return to repeat a couple times

mov [val], eax segfaults because it crosses into the unmapped page. mov [val], ax would work (because I put val 2 bytes before the end of the page).
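The padding constant in the dummy reservation and the resulting address of val can be cross-checked against the nm output. This is a sketch using the addresses shown above; the variable names are made up for the example.

```python
PAGE_SIZE = 4096

dummy_addr = 0x080490A8           # "b dummy" in the nm output
padding = PAGE_SIZE - 0xA8 - 2    # operand of "dummy: resb 4096 - 0xa8 - 2"
bss_end = 0x0804A000              # "_end" in the nm output

# val is placed immediately after the dummy padding.
val_addr = dummy_addr + padding
bytes_left = bss_end - val_addr   # mapped bytes from val to the end of the page

for size in (2, 4):
    overrun = max(0, size - bytes_left)
    print(f"{size}-byte store at {val_addr:#010x}: overruns the page by {overrun} byte(s)")
```

The computed address matches the 08049ffe that nm reports for val, and only the 4-byte store reaches past 0x0804a000 into the unmapped page.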

At this point, /proc/<PID>/smaps shows:

... the r-x private mapping for .text
08049000-0804a000 rwxp 00000000 00:15 2885598                            /home/peter/src/SO/bss-no-segfault
Size:                  4 kB
Rss:                   4 kB
Pss:                   4 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         4 kB
Referenced:            4 kB
Anonymous:             4 kB
...
[vvar] and [vdso] pages exported by the kernel for fast gettimeofday / getpid

Key things: rwxp means read/write/execute and private. Even when stopped before the first instruction, the page is somehow already "dirty" (i.e. written to). So is the text segment, but that's expected from gdb changing the instruction to int3.

The 08049000-0804a000 range (and the 4 kB size of the mapping) shows us that the BSS only has 1 page mapped. There's no data segment, just text and BSS.
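As a rough illustration of reading that mapping line, the sketch below splits the header line copied from the smaps output above and computes the mapping size. The field names follow the usual /proc/<PID>/maps column order; the parsing itself is ad hoc, not a library API.

```python
# Header line copied from the smaps output shown above.
line = ("08049000-0804a000 rwxp 00000000 00:15 2885598"
        "                            /home/peter/src/SO/bss-no-segfault")

# Columns: address range, permissions, file offset, device, inode, pathname.
addr_range, perms, offset, dev, inode, path = line.split(None, 5)
start, end = (int(part, 16) for part in addr_range.split("-"))

size = end - start  # the end address is exclusive
print(f"{size} bytes = {size // 1024} kB, perms = {perms}")
```

The end address of a mapping is exclusive, which is why the last valid byte is 0x08049fff while the range ends at 0804a000.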
