Why do x86-64 Linux system calls modify RCX, and what does the value mean?
Question
I'm trying to allocate some memory in Linux with the sys_brk syscall. Here is what I tried:
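BYTES_TO_ALLOCATE equ 0x08
section .text
global _start
_start:
mov rax, 12                  ; syscall number 12 = sys_brk
mov rdi, BYTES_TO_ALLOCATE   ; first argument (the kernel treats it as an address)
syscall
mov rax, 60                  ; syscall number 60 = sys_exit
syscall                      ; exit status is whatever is in rdi (still 8)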
The thing is, as per the Linux calling convention I expected the return value to be in the rax register (a pointer to the allocated memory). I ran this in gdb, and after making the sys_brk syscall I noticed the following register contents:
Before syscall
rax 0xc 12
rbx 0x0 0
rcx 0x0 0
rdx 0x0 0
rsi 0x0 0
rdi 0x8 8
After syscall
rax 0x401000 4198400
rbx 0x0 0
rcx 0x40008c 4194444 ; <---- What does this value mean?
rdx 0x0 0
rsi 0x0 0
rdi 0x8 8
I do not quite understand the value in the rcx register in this case. Which one should I use as a pointer to the beginning of the 8 bytes I allocated with sys_brk?
Recommended answer
The system call return value is in rax, as always. See What are the calling conventions for UNIX & Linux system calls on i386 and x86-64.
Note that sys_brk has a slightly different interface than the brk / sbrk POSIX functions; see the C library/kernel differences section of the Linux brk(2) man page. Specifically, Linux sys_brk sets the program break; the arg and return value are both pointers. See Assembly x86 brk() call use. That answer needs upvotes because it's the only good one on that question.
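To make the pointer-in, pointer-out interface concrete, here is a minimal C sketch, assuming x86-64 Linux with glibc's syscall() wrapper. The names raw_brk and brk_demo are hypothetical helpers, not part of any library. It asks for the current break with an argument of 0 (an address the kernel rejects, so it just reports the current break), then requests "current break + nbytes". The usable bytes start at the old break. This is also consistent with the question's rax = 0x401000: passing 8 as an address most likely failed and returned the unchanged break.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Raw Linux brk: takes the desired break address and returns the
 * resulting break (unchanged if the request was rejected). */
static uintptr_t raw_brk(uintptr_t new_break) {
    return (uintptr_t)syscall(SYS_brk, new_break);
}

/* Hypothetical demo helper: grows the heap by nbytes (must be > 0),
 * writes and reads back a pattern, then restores the old break so the
 * C library's own heap bookkeeping is not disturbed.
 * Returns 0 on success, -1 if the kernel did not move the break. */
static int brk_demo(size_t nbytes, uintptr_t *old_out, uintptr_t *new_out) {
    uintptr_t old_break = raw_brk(0);                  /* query only */
    uintptr_t new_break = raw_brk(old_break + nbytes); /* request growth */

    *old_out = old_break;
    *new_out = new_break;
    if (new_break != old_break + nbytes)
        return -1;

    /* The newly usable bytes begin at the OLD break. */
    unsigned char *mem = (unsigned char *)old_break;
    memset(mem, 0xAB, nbytes);
    int ok = (mem[0] == 0xAB) && (mem[nbytes - 1] == 0xAB);

    raw_brk(old_break);                                /* shrink back */
    return ok ? 0 : -1;
}
```

In the question's assembly, the equivalent would be one sys_brk call with rdi = 0 to learn the break, then a second call with rdi = that value plus 8.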
The other interesting part of your question is:
I do not quite understand the value in the rcx register in this case
You're seeing the mechanics of how the syscall / sysret instructions are designed to allow the kernel to resume user-space execution but still be fast.
syscall doesn't do any loads or stores; it only modifies registers. Instead of using special registers to save a return address, it simply uses regular integer registers.
It's not a coincidence that RCX=RIP and R11=RFLAGS after the kernel returns to your user-space code. The only way for this not to be the case is if a ptrace system call modified the process's saved rcx or r11 value while it was inside the kernel. (ptrace is the system call gdb uses.) In that case, Linux would use iret instead of sysret to return to user space, because the slower general-case iret can do that. (See What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code? for some walk-through of Linux's system-call entry points. Mostly the entry points from 32-bit processes, not from syscall in a 64-bit process, though.)
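The RCX=RIP observation can be checked from user space. The following is a sketch using GCC/Clang extended inline asm, assuming x86-64 Linux running natively (not under an emulator or a tracer that rewrites registers). The struct and function names are hypothetical. It issues a harmless getpid system call (number 39) and captures rcx and r11 immediately afterwards, along with the address of the instruction that follows syscall.

```c
#include <stdint.h>

#if !defined(__x86_64__) || !defined(__linux__)
#error "This sketch assumes x86-64 Linux"
#endif

/* Register values captured right after a syscall instruction. */
struct syscall_regs {
    uint64_t rax;       /* system call return value */
    uint64_t rcx;       /* expected: address of the next instruction */
    uint64_t r11;       /* expected: the RFLAGS value at the time of the call */
    uint64_t next_rip;  /* address of the instruction after syscall */
};

static struct syscall_regs observe_syscall(void) {
    uint64_t rax = 39;  /* getpid on x86-64 Linux */
    uint64_t rcx, r11, next_rip;

    __asm__ volatile(
        "syscall\n"
        "1:\n\t"                          /* label = return address */
        "mov %%r11, %[r11]\n\t"           /* copy out r11 before it is reused */
        "lea 1b(%%rip), %[next]\n\t"      /* address of label 1 */
        : "+a"(rax), "=c"(rcx), [r11] "=&r"(r11), [next] "=&r"(next_rip)
        :
        : "r11", "cc", "memory");

    struct syscall_regs r = { rax, rcx, r11, next_rip };
    return r;
}
```

If the explanation above holds, rcx equals next_rip, and r11 looks like a flags value (bit 1 of RFLAGS always reads as 1).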
syscall: instead of pushing a return address onto the kernel stack (like int 0x80 does), it:
- sets RCX=RIP, R11=RFLAGS (so it's impossible for the kernel to even see the original values of those regs from before you executed syscall).
- masks RFLAGS with a pre-configured mask from a config register (the IA32_FMASK MSR). This lets the kernel disable interrupts (IF) until it's done swapgs and setting rsp to point to the kernel stack. Even with cli as the first instruction at the entry point, there'd be a window of vulnerability. You also get cld for free by masking off DF, so rep movs / stos go upward even if user space had used std.
Fun fact: AMD's first proposed syscall / swapgs design didn't mask RFLAGS, but they changed it after feedback from kernel developers on the amd64 mailing list (in ~2000, a couple of years before the first silicon).
- jumps to the configured syscall entry point (setting RIP = IA32_LSTAR, with CS loaded from the IA32_STAR MSR). The old CS value isn't saved anywhere, I think.
It doesn't do anything else. The kernel has to use swapgs to get access to an info block where it saved the kernel stack pointer, because rsp still has its value from user space.
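The register effects listed above can be summarized as a toy model in C. This is only an illustrative sketch, not how any kernel or CPU is implemented. The struct names, the mask value, and the entry-point address are assumptions made up for the example. It ignores segment registers, privilege levels, and the reserved-bit handling that a real sysret applies to RFLAGS.

```c
#include <stdint.h>

#define FLAG_TF (1ull << 8)   /* trap flag */
#define FLAG_IF (1ull << 9)   /* interrupt enable flag */
#define FLAG_DF (1ull << 10)  /* direction flag */

/* Only the registers that matter for this explanation. */
struct cpu  { uint64_t rip, rcx, r11, rflags; };
/* Kernel-configured MSRs: entry point and flag mask. */
struct msrs { uint64_t lstar, fmask; };

/* Model of syscall: c->rip is the address of the syscall instruction
 * itself, which is 2 bytes long (0F 05). */
static void model_syscall(struct cpu *c, const struct msrs *m) {
    c->rcx = c->rip + 2;        /* return address goes into RCX */
    c->r11 = c->rflags;         /* user flags go into R11 */
    c->rflags &= ~m->fmask;     /* clear every flag set in the mask */
    c->rip = m->lstar;          /* jump to the kernel entry point */
}

/* Model of sysret: undo the above using RCX and R11. */
static void model_sysret(struct cpu *c) {
    c->rip = c->rcx;
    c->rflags = c->r11;
}
```

With the syscall instruction at 0x40008a, the model leaves rcx = 0x40008c, which matches the value in the question's register dump.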
So the design of syscall requires a system-call ABI that clobbers registers, and that's why the values are what they are.
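For code that issues syscall directly, the practical consequence is that rcx and r11 must be treated as destroyed across the call. A minimal sketch of a raw wrapper in GCC/Clang inline asm, assuming x86-64 Linux, might look like this. The name raw_syscall3 is hypothetical. The important part is the clobber list, which tells the compiler not to keep live values in rcx or r11.

```c
#include <stdint.h>

#if !defined(__x86_64__) || !defined(__linux__)
#error "This sketch assumes x86-64 Linux"
#endif

/* Raw three-argument system call. Arguments go in rdi, rsi, rdx and
 * the call number in rax; the result (or a negative errno value)
 * comes back in rax. */
static long raw_syscall3(long nr, long a1, long a2, long a3) {
    long ret;
    __asm__ volatile(
        "syscall"
        : "=a"(ret)
        : "0"(nr), "D"(a1), "S"(a2), "d"(a3)
        : "rcx", "r11", "memory");  /* syscall overwrites rcx and r11 */
    return ret;
}
```

In hand-written assembly like the question's, the same rule applies: anything you need after the call should not live in rcx or r11.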