How are user-level threads scheduled/created, and how are kernel level threads created?

Problem Description

Apologies if this question is stupid. I tried to find an answer online for quite some time, but couldn't, and hence I'm asking here. I am learning threads, and I've been going through this link and this Linux Plumbers Conference 2013 video about kernel-level and user-level threads. As far as I understood, using pthreads creates threads in userspace, and the kernel is not aware of this, viewing it as a single process only, unaware of how many threads are inside. In such a case,

    • Who decides the scheduling of these user threads during the timeslice the process gets? Since the kernel sees it as a single process and is unaware of the threads, how is the scheduling done?
    • If pthreads creates user-level threads, how are kernel-level or OS threads created from user-space programs, if required?
    • According to the above link, the operating system kernel provides system calls to create and manage threads. So does the clone() system call create a kernel-level thread or a user-level thread?
      • If it creates a kernel-level thread, then strace of a simple pthreads program also shows clone() being used while executing, so why would it be considered a user-level thread?
      • If it doesn't create a kernel-level thread, then how are kernel threads created from userspace programs?

      I was asking about user-level thread creation and its scheduling because here there is a reference to the Many-to-One Model, where many user-level threads are mapped to one kernel-level thread and thread management is done in user space by the thread library. I've only been seeing references to using pthreads, but am unsure whether it creates user-level or kernel-level threads.

      Recommended Answer

      This answer starts from the top comments.

      The documentation you're reading is generic [not linux specific] and a bit outdated. And, more to the point, it is using different terminology. That is, I believe, the source of the confusion. So, read on ...

      What it calls a "user-level" thread is what I'm calling an [outdated] LWP thread. What it calls a "kernel-level" thread is what is called a native thread in linux. Under linux, what is called a "kernel" thread is something else altogether [See below].

      using pthreads create threads in the userspace, and the kernel is not aware about this and view it as a single process only, unaware of how many threads are inside.

      This was how userspace threads were done prior to the NPTL (native posix threads library). This is also what SunOS/Solaris called an LWP (lightweight process).

      There was one process that multiplexed itself and created threads. IIRC, it was called the thread master process [or some such]. The kernel was not aware of this. The kernel didn't yet understand or provide support for threads.

      But, because these "lightweight" threads were switched by code in the userspace-based thread master (aka the "lightweight process scheduler") [just a special user program/process], they were very slow to switch context.

      Also, before the advent of "native" threads, you might have 10 processes. Each process gets 10% of the CPU. If one of the processes was an LWP that had 10 threads, these threads had to share that 10% and, thus, got only 1% of the CPU each.

      All this was replaced by the "native" threads that the kernel's scheduler is aware of. This changeover was done 10-15 years ago.

      Now, with the above example, we have 20 threads/processes that each get 5% of the CPU. And, the context switch is much faster.

      It is still possible to have an LWP system on top of native threads, but now that is a design choice rather than a necessity.

      Further, LWP works great if each thread "cooperates". That is, each thread's loop periodically makes an explicit call to a "context switch" function, voluntarily relinquishing the processor so another LWP can run.

      However, the pre-NPTL implementation in glibc also had to [forcibly] preempt LWP threads (i.e. implement timeslicing). I can't remember the exact mechanism used, but, here's an example. The thread master had to set an alarm, go to sleep, wake up and then send the active thread a signal. The signal handler would effect the context switch. This was messy, ugly, and somewhat unreliable.

      Joachim mentioned pthread_create function creates a kernel thread

      That is [technically] incorrect to call it a kernel thread. pthread_create creates a native thread. This is run in userspace and vies for timeslices on an equal footing with processes. Once created there is little difference between a thread and a process.

      The primary difference is that a process has its own unique address space. A thread, however, is a process that shares its address space with the other processes/threads that are part of the same thread group.

      If it doesn't create a kernel level thread, then how are kernel threads created from userspace programs?

      Kernel threads are not userspace threads, NPTL, native, or otherwise. They are created by the kernel via the kernel_thread function. They run as part of the kernel and are not associated with any userspace program/process/thread. They have full access to the machine: devices, the MMU, etc. Kernel threads run at the highest privilege level: ring 0. They also run in the kernel's address space and not the address space of any user process/thread.

      A userspace program/process may not create a kernel thread. Remember, it creates a native thread using pthread_create, which invokes the clone syscall to do so.

      Threads are useful to do things, even for the kernel. So, it runs some of its code in various threads. You can see these threads by doing ps ax. Look and you'll see kthreadd, ksoftirqd, kworker, rcu_sched, rcu_bh, watchdog, migration, etc. These are kernel threads and not programs/processes.

      UPDATE:

      You mentioned that kernel doesn't know about user threads.

      Remember that, as mentioned above, there are two "eras".

      (1) Before the kernel got thread support (circa 2004?). This used the thread master (which, here, I'll call the LWP scheduler). The kernel just had the fork syscall.

      (2) All kernels after that which do understand threads. There is no thread master, but, we have pthreads and the clone syscall. Now, fork is implemented as clone. clone is similar to fork but takes some arguments. Notably, a flags argument and a child_stack argument.

      More on this below ...

      then, how is it possible for user level threads to have individual stacks?

      There is nothing "magic" about a processor stack. I'll confine discussion [mostly] to x86, but this would be applicable to any architecture, even those that don't have a stack register at all (e.g. 1970s-era IBM mainframes, such as the IBM System/370).

      Under x86, the stack pointer is %rsp. The x86 has push and pop instructions. We use these to save and restore things: push %rcx and [later] pop %rcx.

      But, suppose the x86 did not have %rsp or push/pop instructions? Could we still have a stack? Sure, by convention. We [as programmers] agree that (e.g.) %rbx is the stack pointer.

      In that case, a "push" of %rcx would be [using AT&T assembler]:

      subq    $8,%rbx
      movq    %rcx,0(%rbx)
      

      And, a "pop" of %rcx would be:

      movq    0(%rbx),%rcx
      addq    $8,%rbx
      

      To make it easier, I'm going to switch to C "pseudo code". Here are the above push/pop in pseudo code:

      // push %rcx
          %rbx -= 8;
          0(%rbx) = %rcx;
      
      // pop %rcx
          %rcx = 0(%rbx);
          %rbx += 8;
      



      To create a thread, the LWP scheduler had to create a stack area using malloc. It then had to save this pointer in a per-thread struct, and then kick off the child LWP. The actual code is a bit tricky; assume we have an (e.g.) LWP_create function that is similar to pthread_create:

      typedef void * (*LWP_func)(void *);
      
      // per-thread control
      typedef struct tsk tsk_t;
      struct tsk {
          tsk_t *tsk_next;                    //
          tsk_t *tsk_prev;                    //
          void *tsk_stack;                    // stack base
          u64 tsk_regsave[16];
      };
      
      // list of tasks
      typedef struct tsklist tsklist_t;
      struct tsklist {
          tsk_t *tsk_next;                    //
          tsk_t *tsk_prev;                    //
      };
      
      tsklist_t tsklist;                      // list of tasks
      
      tsk_t *tskcur;                          // current thread
      
      // LWP_switch -- switch from one task to another
      void
      LWP_switch(tsk_t *to)
      {
      
          // NOTE: we use (i.e.) burn register values as we do our work. in a real
          // implementation, we'd have to push/pop these in a special way. so, just
          // pretend that we do that ...
      
          // save all registers into tskcur->tsk_regsave
          tskcur->tsk_regsave[RAX] = %rax;
          // ...
      
          tskcur = to;
      
          // restore most registers from tskcur->tsk_regsave
          %rax = tskcur->tsk_regsave[RAX];
          // ...
      
          // set stack pointer to new task's stack
          %rsp = tskcur->tsk_regsave[RSP];
      
          // set resume address for task
          push(%rsp,tskcur->tsk_regsave[RIP]);
      
          // issue "ret" instruction
          ret();
      }
      
      // LWP_create -- start a new LWP
      tsk_t *
      LWP_create(LWP_func start_routine,void *arg)
      {
          tsk_t *tsknew;
      
          // get per-thread struct for new task
          tsknew = calloc(1,sizeof(tsk_t));
          append_to_tsklist(tsknew);
      
          // get new task's stack; the stack grows down, so the initial
          // stack pointer is the TOP of the malloc'd area
          tsknew->tsk_stack = malloc(0x100000);
          tsknew->tsk_regsave[RSP] = (u64) tsknew->tsk_stack + 0x100000;
      
          // the new task begins execution at start_routine
          tsknew->tsk_regsave[RIP] = (u64) start_routine;
      
          // give task its argument
          tsknew->tsk_regsave[RDI] = (u64) arg;
      
          // switch to new task
          LWP_switch(tsknew);
      
          return tsknew;
      }
      
      // LWP_destroy -- destroy an LWP
      void
      LWP_destroy(tsk_t *tsk)
      {
      
          // free the task's stack
          free(tsk->tsk_stack);
      
          remove_from_tsklist(tsk);
      
          // free per-thread struct for dead task
          free(tsk);
      }
      



      With a kernel that understands threads, we use pthread_create and clone, but we still have to create the new thread's stack. The kernel does not create/assign a stack for a new thread. The clone syscall accepts a child_stack argument. Thus, pthread_create must allocate a stack for the new thread and pass that to clone:

      // pthread_create -- start a new native thread
      tsk_t *
      pthread_create(LWP_func start_routine,void *arg)
      {
          tsk_t *tsknew;
      
          // get per-thread struct for new task
          tsknew = calloc(1,sizeof(tsk_t));
          append_to_tsklist(tsknew);
      
          // get new task's stack; clone() wants the TOP of the
          // area, since the stack grows down
          tsknew->tsk_stack = malloc(0x100000);
      
          // start up thread (the real pthread_create passes more clone
          // flags, e.g. CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND)
          clone(start_routine,(char *) tsknew->tsk_stack + 0x100000,CLONE_THREAD,arg);
      
          return tsknew;
      }
      
      // pthread_join -- wait for a native thread to finish and clean up
      void
      pthread_join(tsk_t *tsk)
      {
      
          // wait for thread to die ...
      
          // free the task's stack
          free(tsk->tsk_stack);
      
          remove_from_tsklist(tsk);
      
          // free per-thread struct for dead task
          free(tsk);
      }
      



      Only a process or main thread is assigned its initial stack by the kernel, usually at a high memory address. So, if the process does not use threads, normally, it just uses that pre-assigned stack.

      But, if a thread is created, either an LWP or a native one, the starting process/thread must pre-allocate the area for the proposed thread with malloc. Side note: Using malloc is the normal way, but the thread creator could just have a large pool of global memory: char stack_area[MAXTASK][0x100000]; if it wished to do it that way.

      Even an ordinary program that does not use threads [of any type] may wish to "override" the default stack it has been given.

      That process could decide to use malloc and the above assembler trickery to create a much larger stack if it were doing a hugely recursive function.

      See my answer here: What is the difference in memory usage between a user-defined stack and the built-in stack?
