How to access a segment register without linking libc.so?
Question
I am attempting to code a simple stack canary in 64-bit assembly using NASM version 2.15.04 on Ubuntu 20.10. Executing the code below results in a segmentation fault when it is assembled and linked with the command nasm -felf64 canary.asm && ld canary.o.
global _start
section .text
_start: endbr64
push rbp ; Save base pointer
mov rbp, rsp ; Set the base pointer
call _func ; Call _func
mov rdi, rax ; Save return value of _func in RDI
mov rax, 0x3c ; Specify exit syscall
syscall ; Exit
_func: endbr64
push rbp ; Save the base pointer
mov rbp, rsp ; Set the base pointer
sub rsp, 0x8 ; Adjust the stack pointer
mov rax, qword fs:[0x28] ; Get stack canary
mov qword [rbp - 0x8], rax ; Save stack canary on the stack
xor eax, eax ; Clear RAX
mov rax, 0x1 ; Specify write syscall
mov rdi, 0x1 ; Specify stdout
mov rsi, msg ; Char* buffer to print
mov rdx, 0xd ; Length of the buffer
syscall ; Write msg
mov rax, qword [rbp - 0x8] ; Retrieve the stack canary
xor rax, qword fs:[0x28] ; Compare to original value
je _return ; Jump to _return if canary matched original
xor eax, eax ; Clear RAX
mov rax, 0x1 ; Specify write syscall
mov rdi, 0x1 ; Specify stdout
mov rsi, stack_fail ; Char* buffer to print
mov rdx, 0x18 ; Length of the buffer
syscall ; Write stack_fail
mov rax, 0x3c ; Specify exit syscall
mov rdi, 0x1 ; Specify error code 1
syscall ; Exit
_return: xor eax, eax ; Set return value to 0
add rsp, 0x8 ; Reset stack pointer
pop rbp ; Get original base pointer
ret ; Return
section .data
msg: db "Hello, World", 0xa, 0x0
stack_fail db "Stack smashing detected", 0xa, 0x0
Debugging with GDB shows that the segmentation fault happens on line 16: mov rax, qword fs:[0x28].
─────────────────────────────────────────────────────────────────────────────────── code:x86:64 ────
0x40101b <_func+4> push rbp
0x40101c <_func+5> mov rbp, rsp
0x40101f <_func+8> sub rsp, 0x8
→ 0x401023 <_func+12> mov rax, QWORD PTR fs:0x28
0x40102c <_func+21> mov QWORD PTR [rbp-0x8], rax
0x401030 <_func+25> xor eax, eax
0x401032 <_func+27> mov eax, 0x1
0x401037 <_func+32> mov edi, 0x1
0x40103c <_func+37> movabs rsi, 0x402000
─────────────────────────────────────────────────────────────────────────────────────── threads ────
[#0] Id 1, Name: "a.out", stopped 0x401023 in _func (), reason: SIGSEGV
However, assembling and dynamically linking with libc via nasm -felf64 canary.asm && ld canary.o -lc -dynamic-linker /usr/lib64/ld-linux-x86-64.so.2 results in execution succeeding, with no segmentation fault.
Using Radare2 to compare the final binaries shows that both versions assembled the problem instruction identically as:
0x00401023 64488b042528. mov rax, qword fs:[0x28]
In both cases GDB also shows that the FS register is 0x0000 when that instruction executes.
So the instruction bytes and the FS register are identical whether or not the binary is linked with libc, and the code uses no external symbols from libc. Why does linking with libc cause execution to succeed, while not linking it causes a segmentation fault? Is it possible to implement this without linking libc, and if so, how?
Note: the relevance of, or need for, a stack canary in this example is not the point of the question.
Recommended answer
Accessing a segment register is no problem: just mov eax, fs. But what you're trying to do is access thread-local storage at a small offset from the FS segment base, which libc's init code will have asked the kernel to set up. The selector value that GDB shows (0) is not the base: in 64-bit mode the FS base is held separately, and without that setup it is 0, so fs:[0x28] reads the unmapped address 0x28 and faults.
The simplest thing would be to just access your stack canary with a normal RIP-relative addressing mode, not relative to the FS base, like GCC does when targeting other ISAs. You only need TLS if you want to make it harder for some other exploit to reach the canary (and want its address to be separately randomizable). (Or so that library code can access it without the indirection of loading a pointer from the GOT, instead of access only being efficient for code in the main executable.)
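As a rough illustration of the RIP-relative approach, here is a minimal sketch that should build with the same nasm -felf64 and plain ld commands as in the question. The label names (canary, func, .smashed) and the use of the getrandom system call to fill the canary are assumptions made for this example; the answer itself does not prescribe how the canary value is chosen.

```nasm
global _start

section .text
_start:
        ; Assumption for this sketch: fill the canary with 8 random bytes
        ; using getrandom(buf, len, flags).
        mov     eax, 318                ; __NR_getrandom on x86-64
        lea     rdi, [rel canary]       ; buffer to fill
        mov     esi, 8                  ; number of bytes
        xor     edx, edx                ; flags = 0
        syscall                         ; return value ignored in this sketch

        call    func
        mov     edi, eax                ; exit status = return value of func
        mov     eax, 60                 ; __NR_exit
        syscall

func:
        push    rbp
        mov     rbp, rsp
        sub     rsp, 16                 ; reserve space, keep RSP 16-byte aligned
        mov     rax, [rel canary]       ; RIP-relative load, no FS base involved
        mov     [rbp - 8], rax          ; canary sits just below the saved RBP

        ; function body would go here

        mov     rax, [rbp - 8]          ; reload the stack copy
        sub     rax, [rel canary]       ; RAX becomes 0 if the copy is intact
        jne     .smashed
        leave                           ; restore RSP and RBP
        ret                             ; returns 0 in RAX
.smashed:
        mov     edi, 1                  ; exit status 1 on mismatch
        mov     eax, 60                 ; __NR_exit
        syscall

section .bss
canary: resq 1                          ; global canary, addressed RIP-relative
```

Because the canary lives at a fixed offset from the code, nothing needs to be set up before the first access, which is what makes this simpler than the FS-based scheme.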
You can of course make the same system calls libc does to set up thread-local storage and use it, if you want to copy GCC's stack-canary code.
Fun fact: sub rax, qword fs:[0x28] is a more efficient way to check the canary than XOR, because it can macro-fuse with the JCC into a single uop. That's why current GCC changed to using sub. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90568 (fixed in GCC 10+).
My GCC bug report actually included self-contained microbenchmark code (to prove that sub can macro-fuse even with an FS: addressing mode).
It is a static executable without libc, so it sets up the FS segment itself, making its base address the address of a buffer so that [fs: 0x28] will work. This is a basic form of TLS.
global _start
_start:
cookie equ 12345
mov eax, 158 ; __NR_arch_prctl
mov edi, 0x1002 ; ARCH_SET_FS
lea rsi, [buf]
syscall
mov qword [fs: 0x28], cookie
...
section .bss
buf: resb 4096 ; fs.base will point at this buffer
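To see that the FS base really does point at the buffer, the fragment above can be expanded into a complete program. The following is only a sketch under a few assumptions: the tls_buf label and the exit-status convention (0 on success, 1 on failure) are invented for this example, and the check simply reads the same 8 bytes back through a RIP-relative address.

```nasm
global _start

section .text
_start:
        mov     eax, 158                  ; __NR_arch_prctl
        mov     edi, 0x1002               ; ARCH_SET_FS
        lea     rsi, [rel tls_buf]        ; new FS base = address of the buffer
        syscall
        test    rax, rax                  ; arch_prctl returns 0 on success
        jnz     .fail

        mov     qword [fs:0x28], 12345    ; store the cookie through FS
        mov     rax, [rel tls_buf + 0x28] ; read the same bytes directly
        cmp     rax, 12345
        jne     .fail

        xor     edi, edi                  ; exit status 0: FS base is tls_buf
        jmp     .exit
.fail:
        mov     edi, 1                    ; exit status 1: setup or check failed
.exit:
        mov     eax, 60                   ; __NR_exit
        syscall

section .bss
tls_buf: resb 4096                        ; fs.base will point at this buffer
```

Assembled with nasm -felf64 and linked with plain ld, running it and then checking echo $? in the shell shows whether the store through FS landed in the buffer.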
If the kernel enabled wrfsbase for user-space use, you could use wrfsbase rsi instead of making a system call. I think the most recent Linux kernel (5.10) may have started using wrfsbase itself, but I don't know if it enables user-space use of it.
(It probably doesn't toggle FSGSBASE on/off every time it uses it, so kernel usage would mean user space can use it too: the fault conditions in the manual don't mention privilege level, only the CPUID feature bit and a bit in the CR4 control register. It is also only available in 64-bit mode; it will #UD in other modes, including compat mode.)
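One way to find out at run time, rather than guessing from the kernel version, is to look at the ELF auxiliary vector. As far as I know, Linux kernels that enable FSGSBASE for user space advertise it as bit 1 (HWCAP2_FSGSBASE) of the AT_HWCAP2 entry. The sketch below walks the initial stack by hand, since there is no libc to call getauxval. The label names and exit codes are assumptions made for this example, and the constants should be checked against your kernel headers.

```nasm
global _start

AT_HWCAP2       equ 26                  ; auxv entry type (from elf.h)
HWCAP2_FSGSBASE equ 1 << 1              ; user-space FSGSBASE allowed

section .text
_start:
        mov     rcx, [rsp]              ; argc
        lea     rsi, [rsp + rcx*8 + 16] ; skip argc, argv[] and its NULL: envp
.skip_env:
        mov     rax, [rsi]
        add     rsi, 8
        test    rax, rax
        jnz     .skip_env               ; stop after envp's terminating NULL
.next_aux:                              ; RSI now points at the auxv pairs
        mov     rax, [rsi]              ; a_type
        mov     rdx, [rsi + 8]          ; a_val
        add     rsi, 16
        cmp     rax, AT_HWCAP2
        je      .found
        test    rax, rax                ; AT_NULL (0) ends the vector
        jnz     .next_aux
        jmp     .unsupported
.found:
        test    edx, HWCAP2_FSGSBASE
        jz      .unsupported

        lea     rax, [rel tls_buf]
        wrfsbase rax                    ; set the FS base without a system call
        mov     qword [fs:0x28], 12345  ; the canary slot is now usable
        xor     edi, edi                ; exit status 0: wrfsbase was used
        jmp     .exit
.unsupported:
        mov     edi, 2                  ; exit status 2: use arch_prctl instead
.exit:
        mov     eax, 60                 ; __NR_exit
        syscall

section .bss
tls_buf: resb 4096
```

Checking the bit first matters because wrfsbase raises #UD (delivered as SIGILL) when CR4.FSGSBASE is clear, so a program that executes it unconditionally would crash on kernels or CPUs without support.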