How are user-level threads scheduled/created, and how are kernel level threads created?
Question
Apologies if this question is stupid. I tried to find an answer online for quite some time, but couldn't, and hence I'm asking here. I am learning threads, and I've been going through this link and this Linux Plumbers Conference 2013 video about kernel-level and user-level threads. As far as I understood, using pthreads creates threads in userspace, and the kernel is not aware of this and views it as a single process only, unaware of how many threads are inside. In such a case,
- who decides the scheduling of these user threads during the timeslice the process gets, as the kernel sees it as a single process and is unaware of the threads, and how is the scheduling done?
- If pthreads create user level threads, how are kernel level or OS threads created from user space programs, if required?
- According to the above link, it says the operating system's kernel provides system calls to create and manage threads. So does a clone() system call create a kernel-level thread or a user-level thread?
  - If it creates a kernel-level thread, then strace of a simple pthreads program also shows clone() being used while executing, but then why would it be considered a user-level thread?
  - If it doesn't create a kernel-level thread, then how are kernel threads created from userspace programs?
I was asking about user-level thread creation and its scheduling because here there is a reference to the Many-to-One Model, where many user-level threads are mapped to one kernel-level thread, and thread management is done in user space by the thread library. I've only been seeing references to using pthreads, but I'm unsure whether it creates user-level or kernel-level threads.
Answer
Starting from the top comments:
The documentation you're reading is generic [not linux specific] and a bit outdated. And, more to the point, it is using different terminology. That is, I believe, the source of the confusion. So, read on ...
What it calls a "user-level" thread is what I'm calling an [outdated] LWP thread. What it calls a "kernel-level" thread is what is called a native thread in linux. Under linux, what is called a "kernel" thread is something else altogether [See below].
using pthreads create threads in the userspace, and the kernel is not aware about this and view it as a single process only, unaware of how many threads are inside.
This was how userspace threads were done prior to the NPTL (native posix threads library). This is also what SunOS/Solaris called an LWP, a lightweight process.

There was one process that multiplexed itself and created threads. IIRC, it was called the thread master process [or some such]. The kernel was not aware of this. The kernel didn't yet understand or provide support for threads.
But, because these "lightweight" threads were switched by code in the userspace-based thread master (aka "lightweight process scheduler") [just a special user program/process], they were very slow to switch context.
Also, before the advent of "native" threads, you might have 10 processes. Each process gets 10% of the CPU. If one of the processes was an LWP that had 10 threads, these threads had to share that 10% and, thus, got only 1% of the CPU each.
All this was replaced by the "native" threads that the kernel's scheduler is aware of. This changeover was done 10-15 years ago.
Now, with the above example, we have 20 threads/processes that each get 5% of the CPU. And, the context switch is much faster.
It is still possible to have an LWP system on top of native threads, but now that is a design choice rather than a necessity.
Further, LWP works great if each thread "cooperates". That is, each thread's loop periodically makes an explicit call to a "context switch" function, voluntarily relinquishing the processor so another LWP can run.
However, the pre-NPTL implementation in glibc also had to [forcibly] preempt LWP threads (i.e. implement timeslicing). I can't remember the exact mechanism used, but here's an example: the thread master had to set an alarm, go to sleep, wake up, and then send the active thread a signal. The signal handler would effect the context switch. This was messy, ugly, and somewhat unreliable.
Joachim mentioned that the pthread_create function creates a kernel thread.
It is [technically] incorrect to call it a kernel thread. pthread_create creates a native thread. This is run in userspace and vies for timeslices on an equal footing with processes. Once created, there is little difference between a thread and a process.
The primary difference is that a process has its own unique address space. A thread, however, is a process that shares its address space with the other processes/threads that are part of the same thread group.
If it doesn't create a kernel level thread, then how are kernel threads created from userspace programs?
Kernel threads are not userspace threads, NPTL, native, or otherwise. They are created by the kernel via the kernel_thread function. They run as part of the kernel and are not associated with any userspace program/process/thread. They have full access to the machine: devices, the MMU, etc. Kernel threads run at the highest privilege level, ring 0. They also run in the kernel's address space and not in the address space of any user process/thread.
A userspace program/process may not create a kernel thread. Remember, it creates a native thread using pthread_create, which invokes the clone syscall to do so.
Threads are useful to do things, even for the kernel. So, it runs some of its code in various threads. You can see these threads by doing ps ax. Look and you'll see kthreadd, ksoftirqd, kworker, rcu_sched, rcu_bh, watchdog, migration, etc. These are kernel threads and not programs/processes.

UPDATE:
You mentioned that the kernel doesn't know about user threads.
Remember that, as mentioned above, there are two "eras".
(1) Before the kernel got thread support (circa 2004?). This used the thread master (which, here, I'll call the LWP scheduler). The kernel just had the fork syscall.
(2) All kernels after that, which do understand threads. There is no thread master; instead, we have pthreads and the clone syscall. Now, fork is implemented as clone. clone is similar to fork but takes some arguments. Notably, a flags argument and a child_stack argument.

More on this below ...
then, how is it possible for user level threads to have individual stacks?
There is nothing "magic" about a processor stack. I'll confine the discussion [mostly] to x86, but this is applicable to any architecture, even those that don't have a stack register (e.g. 1970s-era IBM mainframes, such as the IBM System 370).
Under x86, the stack pointer is %rsp. The x86 has push and pop instructions. We use these to save and restore things: push %rcx and [later] pop %rcx.
But suppose the x86 did not have %rsp or push/pop instructions. Could we still have a stack? Sure, by convention. We [as programmers] agree that (e.g.) %rbx is the stack pointer.
In that case, a "push" of %rcx would be [using AT&T assembler]:

```
subq    $8,%rbx
movq    %rcx,0(%rbx)
```
And, a "pop" of %rcx would be:

```
movq    0(%rbx),%rcx
addq    $8,%rbx
```
To make it easier, I'm going to switch to C "pseudo code". Here are the above push/pop in pseudo code:
```
// push %rcx
%rbx -= 8;
0(%rbx) = %rcx;

// pop %rcx
%rcx = 0(%rbx);
%rbx += 8;
```
To create a thread, the LWP scheduler had to create a stack area using malloc. It then had to save this pointer in a per-thread struct, and then kick off the child LWP. The actual code is a bit tricky; assume we have an (e.g.) LWP_create function that is similar to pthread_create:

```c
typedef void *(*LWP_func)(void *);

// per-thread control
typedef struct tsk tsk_t;
struct tsk {
    tsk_t *tsk_next;
    tsk_t *tsk_prev;
    void *tsk_stack;                    // stack base
    u64 tsk_regsave[16];
};

// list of tasks
typedef struct tsklist tsklist_t;
struct tsklist {
    tsk_t *tsk_next;
    tsk_t *tsk_prev;
};

tsklist_t tsklist;                      // list of tasks
tsk_t *tskcur;                          // current thread

// LWP_switch -- switch from one task to another
void
LWP_switch(tsk_t *to)
{
    // NOTE: we use (i.e.) burn register values as we do our work.  in a real
    // implementation, we'd have to push/pop these in a special way.  so, just
    // pretend that we do that ...

    // save all registers into tskcur->tsk_regsave
    tskcur->tsk_regsave[RAX] = %rax;
    // ...

    tskcur = to;

    // restore most registers from tskcur->tsk_regsave
    %rax = tskcur->tsk_regsave[RAX];
    // ...

    // set stack pointer to new task's stack
    %rsp = tskcur->tsk_regsave[RSP];

    // set resume address for task
    push(%rsp,tskcur->tsk_regsave[RIP]);

    // issue "ret" instruction
    ret();
}

// LWP_create -- start a new LWP
tsk_t *
LWP_create(LWP_func start_routine,void *arg)
{
    tsk_t *tsknew;

    // get per-thread struct for new task
    tsknew = calloc(1,sizeof(tsk_t));
    append_to_tsklist(tsknew);

    // get new task's stack
    tsknew->tsk_stack = malloc(0x100000);
    tsknew->tsk_regsave[RSP] = tsknew->tsk_stack;

    // give task its argument
    tsknew->tsk_regsave[RDI] = arg;

    // switch to new task
    LWP_switch(tsknew);

    return tsknew;
}

// LWP_destroy -- destroy an LWP
void
LWP_destroy(tsk_t *tsk)
{
    // free the task's stack
    free(tsk->tsk_stack);
    remove_from_tsklist(tsk);

    // free per-thread struct for dead task
    free(tsk);
}
```
With a kernel that understands threads, we use pthread_create and clone, but we still have to create the new thread's stack. The kernel does not create/assign a stack for a new thread. The clone syscall accepts a child_stack argument. Thus, pthread_create must allocate a stack for the new thread and pass that to clone:

```c
// pthread_create -- start a new native thread
tsk_t *
pthread_create(LWP_func start_routine,void *arg)
{
    tsk_t *tsknew;

    // get per-thread struct for new task
    tsknew = calloc(1,sizeof(tsk_t));
    append_to_tsklist(tsknew);

    // get new task's stack
    tsknew->tsk_stack = malloc(0x100000);

    // start up thread
    clone(start_routine,tsknew->tsk_stack,CLONE_THREAD,arg);

    return tsknew;
}

// pthread_join -- destroy an LWP
void
pthread_join(tsk_t *tsk)
{
    // wait for thread to die ...

    // free the task's stack
    free(tsk->tsk_stack);
    remove_from_tsklist(tsk);

    // free per-thread struct for dead task
    free(tsk);
}
```
Only a process or main thread is assigned its initial stack by the kernel, usually at a high memory address. So, if the process does not use threads, normally, it just uses that pre-assigned stack.
But, if a thread is created, either an LWP or a native one, the starting process/thread must pre-allocate the area for the proposed thread with malloc. Side note: Using malloc is the normal way, but the thread creator could just have a large pool of global memory: char stack_area[MAXTASK][0x100000]; if it wished to do it that way.
If we had an ordinary program that does not use threads [of any type], it may wish to "override" the default stack it has been given.
That process could decide to use malloc and the above assembler trickery to create a much larger stack if it were doing a hugely recursive function.

See my answer here: What is the difference in memory usage between a user-defined stack and the built-in stack?