Who sets the RIP register when you call the clone syscall?


Question

I am trying to implement a minimal kernel and I am trying to implement the clone syscall. In the man pages you can see the clone syscall defined as such:

int clone(int (*fn)(void *), void *stack, int flags, void *arg, ...
                 /* pid_t *parent_tid, void *tls, pid_t *child_tid */ );

As you can see, it takes a function pointer. If you read the man page more closely, though, the raw syscall implemented by the kernel does not take one:

long clone(unsigned long flags, void *stack,
                      int *parent_tid, int *child_tid,
                      unsigned long tls);

So, my question is, who modifies the RIP register after a thread is created? Is it the libc?

I found this code in glibc: https://elixir.bootlin.com/glibc/latest/source/sysdeps/unix/sysv/linux/x86_64/clone.S but I am not sure at what point the function is actually called.

Additional information:

When looking at the clone.S source code you can see that it jumps to a thread_start branch after the syscall. On the branch after the clone syscall (so only the child does this) it pops the function address and the arguments from the stack. Who actually pushed these arguments and the function address on the stack? I guess it has to happen somewhere in the kernel because at the point of the syscall instruction they were not there.

Here is some gdb output:

Before the syscall:

[-------------------------------------code-------------------------------------]
   0x7ffff7d8af22 <clone+34>:   mov    r8,r9
   0x7ffff7d8af25 <clone+37>:   mov    r10,QWORD PTR [rsp+0x8]
   0x7ffff7d8af2a <clone+42>:   mov    eax,0x38
=> 0x7ffff7d8af2f <clone+47>:   syscall 
   0x7ffff7d8af31 <clone+49>:   test   rax,rax
   0x7ffff7d8af34 <clone+52>:   jl     0x7ffff7d8af49 <clone+73>
   0x7ffff7d8af36 <clone+54>:   je     0x7ffff7d8af39 <clone+57>
   0x7ffff7d8af38 <clone+56>:   ret
Guessed arguments:
arg[0]: 0x3d0f00 
arg[1]: 0x7ffff8020b60 --> 0x7ffff7d3fb30 (<do_something>:  push   rbx)
arg[2]: 0x7fffffffda90 --> 0x0 
[------------------------------------stack-------------------------------------]
0000| 0x7fffffffda78 --> 0x7ffff7d3f52c (<main+172>:    pop    rsi)
0008| 0x7fffffffda80 --> 0x7fffffffda94 --> 0x73658b0000000000 
0016| 0x7fffffffda88 --> 0x7fffffffda94 --> 0x73658b0000000000 
0024| 0x7fffffffda90 --> 0x0 
0032| 0x7fffffffda98 --> 0x492e085573658b00 
0040| 0x7fffffffdaa0 --> 0x7ffff7d3f0d0 (<_init>:   sub    rsp,0x8)
0048| 0x7fffffffdaa8 --> 0x7ffff7d40830 (<__libc_csu_init>: push   r15)
0056| 0x7fffffffdab0 --> 0x7ffff7d408d0 (<__libc_csu_fini>: push   rbp)
[------------------------------------------------------------------------------]

After the syscall instruction on the child thread (check the top of the stack - this does not happen on the parent's thread):

[-------------------------------------code-------------------------------------]
   0x7ffff7d8af25 <clone+37>:   mov    r10,QWORD PTR [rsp+0x8]
   0x7ffff7d8af2a <clone+42>:   mov    eax,0x38
   0x7ffff7d8af2f <clone+47>:   syscall 
=> 0x7ffff7d8af31 <clone+49>:   test   rax,rax
   0x7ffff7d8af34 <clone+52>:   jl     0x7ffff7d8af49 <clone+73>
   0x7ffff7d8af36 <clone+54>:   je     0x7ffff7d8af39 <clone+57>
   0x7ffff7d8af38 <clone+56>:   ret    
   0x7ffff7d8af39 <clone+57>:   xor    ebp,ebp
[------------------------------------stack-------------------------------------]
0000| 0x7ffff8020b60 --> 0x7ffff7d3fb30 (<do_something>:    push   rbx)
0008| 0x7ffff8020b68 --> 0x7ffff7dd5add --> 0x4c414d0074736574 ('test')
0016| 0x7ffff8020b70 --> 0x0 
0024| 0x7ffff8020b78 --> 0x411 
0032| 0x7ffff8020b80 ("Parameters: 0x7ffff7d3fb30 4001536 0x7ffff8020b70 0x7fffffffda90 0x7ffff8000b60 0x7fffffffda94\n")
0040| 0x7ffff8020b88 ("rs: 0x7ffff7d3fb30 4001536 0x7ffff8020b70 0x7fffffffda90 0x7ffff8000b60 0x7fffffffda94\n")
0048| 0x7ffff8020b90 ("fff7d3fb30 4001536 0x7ffff8020b70 0x7fffffffda90 0x7ffff8000b60 0x7fffffffda94\n")
0056| 0x7ffff8020b98 ("30 4001536 0x7ffff8020b70 0x7fffffffda90 0x7ffff8000b60 0x7fffffffda94\n")
[------------------------------------------------------------------------------]

Recommended answer

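Yes, it is libc. The kernel interface is like fork: it returns twice to the same place, with different return values (0 in the child, a PID/TID in the parent). The kernel never sets RIP to your function; the child simply resumes at the instruction after syscall, and the wrapper code there branches on the return value. The man page documents the differences between the glibc wrapper and the kernel interface, as it does for other system calls where the two differ.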

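The libc wrapper stashes the function pointer and the arg you pass in the new thread's stack space before it executes the syscall instruction, so the new thread can load them from there. Nothing is pushed by the kernel: in your gdb output, arg[1] (0x7ffff8020b60) already points at do_something before the syscall runs. The kernel starts the child with its RSP set to the stack value handed to the raw syscall, so the child has no access to the parent's locals in stack memory, and using a global would not be thread-safe if multiple threads were cloning themselves at the same time.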

Note that there is also a clone3 system call, which takes its arguments in a struct. It is closer to the raw kernel interface for clone in that it has no function-pointer parameter (or at least, there is no glibc wrapper for it that adds one).
