How do the schedule() and switch_to() functions in the Linux kernel actually work?
Question
I'm trying to understand how scheduling in the Linux kernel actually works. My question is not about the scheduling algorithm; it's about how the functions schedule() and switch_to() work.
I'll try to explain. Here is what I have seen:
When a process runs out of its time slice, the need_resched flag is set by scheduler_tick(). The kernel checks the flag, sees that it is set, and calls schedule() (pertinent to question 1) to switch to a new process. This flag is a message that schedule() should be invoked as soon as possible because another process deserves to run.
Upon returning to user space or returning from an interrupt, the need_resched flag is checked. If it is set, the kernel invokes the scheduler before continuing.
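The flag mechanism described above can be sketched as a small user-space simulation. This is only an illustration under stated assumptions, not kernel code: struct sim_task, sim_scheduler_tick() and sim_return_to_user() are hypothetical names invented here, and the time-slice accounting is far simpler than what the 2.6.10 scheduler really does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a task; not the kernel's task_struct. */
struct sim_task {
    int  time_slice;      /* timer ticks left before the task should yield */
    bool need_resched;    /* models the need_resched flag */
    int  schedule_calls;  /* how often the (simulated) scheduler was invoked */
};

/* Models scheduler_tick(): it runs in timer-interrupt context, so it only
 * marks the task as needing a reschedule; it never switches tasks itself. */
static void sim_scheduler_tick(struct sim_task *t)
{
    if (t->time_slice > 0)
        t->time_slice--;
    if (t->time_slice == 0)
        t->need_resched = true;
}

/* Models the check made on the way back to user space.
 * Returns true if the scheduler was invoked before continuing. */
static bool sim_return_to_user(struct sim_task *t)
{
    if (!t->need_resched)
        return false;
    /* A real kernel would call schedule() here and pick another task. */
    t->need_resched = false;
    t->schedule_calls++;
    return true;
}
```

With a two-tick time slice, the first tick leaves the flag clear and the return path does nothing; the second tick sets the flag, and only the following return path invokes the simulated scheduler. The point is the split: the tick only records the request, and a later safe point acts on it.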
Looking into the kernel source (linux-2.6.10, the version that the book "Linux Kernel Development, Second Edition" is based on), I also saw that some code can call schedule() voluntarily, giving another process the right to run.
I saw that switch_to() is the function that actually performs the context switch. I looked into some architecture-dependent code, trying to understand what switch_to() was actually doing.
That behavior raised some questions that I could not find answers to:
1. When switch_to() finishes, what is the current running process? The process that called schedule()? Or the next process, the one that was picked to run?
2. When schedule() is called from an interrupt, does the selected process start running when the interrupt handling finishes (after some kind of RTE), or before that?
3. If schedule() cannot be called from an interrupt, when is the need_resched flag set?
4. When the timer interrupt handler is running, which stack is being used?
I don't know if I made myself clear. If not, I hope I can clarify after some answers (or questions). I have already looked at several sources trying to understand this process. I have the book "Linux Kernel Development, Second Edition", and I'm using it too. I know a bit about the MIPS and H8300 architectures, if that helps to explain.
Recommended answer
1. After calling switch_to(), the kernel stack is switched to that of the task named in next. Changing the address space, etc., is handled in, e.g., context_switch().
2. schedule() cannot be called in atomic context, including from an interrupt (see the check in schedule_debug()). If a reschedule is needed, the TIF_NEED_RESCHED task flag is set, which is checked in the interrupt return path.
3. See 2.
4. I believe that, with the default 8K stacks, interrupts are handled on whatever kernel stack is currently executing. If 4K stacks are used, I believe there's a separate interrupt stack (automatically loaded thanks to some x86 magic), but I'm not completely certain on that point.
To be a bit more detailed, here's a practical example:
- An interrupt occurs. The CPU switches to an interrupt trampoline routine, which pushes the interrupt number onto the stack, then jumps to common_interrupt.
- common_interrupt calls do_IRQ, which disables preemption and then handles the IRQ.
- At some point, a decision is made to switch tasks. This may come from the timer interrupt or from a wakeup call. In either case, set_task_need_resched is invoked, setting the TIF_NEED_RESCHED task flag.
- Eventually, the CPU returns from do_IRQ in the original interrupt and proceeds to the IRQ exit path. If this IRQ was invoked from within the kernel, it checks whether TIF_NEED_RESCHED is set, and if so calls preempt_schedule_irq, which briefly enables interrupts while performing a schedule().
- If the IRQ was invoked from userspace, we first check whether there's anything that needs doing prior to returning. If so, we go to retint_careful, which checks both for a pending reschedule (directly invoking schedule() if needed) and for pending signals, then goes back for another round at retint_check until no more important flags are set.
- Finally, we restore GS and return from the interrupt handler.
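The exit-path decisions in this example can be modelled in plain C. This is a rough sketch rather than the real entry code (which is assembly): struct sim_cpu and every sim_-prefixed function are hypothetical names introduced here, and the preempt_count test before kernel preemption is an added assumption that goes beyond what the steps above state.

```c
#include <assert.h>
#include <stdbool.h>

enum irq_origin { FROM_KERNEL, FROM_USER };

/* Hypothetical model of the per-task state consulted on IRQ exit. */
struct sim_cpu {
    bool need_resched;     /* models TIF_NEED_RESCHED */
    bool signal_pending;   /* models a pending-signal work flag */
    int  preempt_count;    /* > 0 means we are in atomic context */
    int  schedule_calls;   /* times the simulated scheduler ran */
    int  signals_handled;  /* times pending signals were processed */
};

/* Models schedule(); the assert echoes the spirit of the schedule_debug()
 * check: the scheduler must never run in atomic context. */
static void sim_schedule(struct sim_cpu *c)
{
    assert(c->preempt_count == 0);
    c->need_resched = false;
    c->schedule_calls++;
}

/* Models do_IRQ: preemption is disabled while the handler runs, so a
 * wish to reschedule can only be recorded as a flag. */
static void sim_irq_handler(struct sim_cpu *c, bool wants_resched)
{
    c->preempt_count++;
    if (wants_resched)
        c->need_resched = true;
    c->preempt_count--;
}

/* Models the IRQ exit path for both origins. */
static void sim_irq_exit(struct sim_cpu *c, enum irq_origin from)
{
    if (from == FROM_KERNEL) {
        /* Stands in for the preempt_schedule_irq call. */
        if (c->need_resched && c->preempt_count == 0)
            sim_schedule(c);
        return;
    }
    /* Stands in for the retint_careful / retint_check loop:
     * keep going until no work flag remains set. */
    while (c->need_resched || c->signal_pending) {
        if (c->need_resched)
            sim_schedule(c);
        if (c->signal_pending) {
            c->signal_pending = false;
            c->signals_handled++;
        }
    }
}
```

Calling sim_irq_handler() with wants_resched set never increments schedule_calls by itself; the counter only moves once sim_irq_exit() runs, which mirrors the answer's point that the interrupt sets a flag and the return path does the actual scheduling.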
As for switch_to(): what switch_to() does (on x86-32) is:
- Save the current values of EIP (instruction pointer) and ESP (stack pointer) for when we return to this task at some point later.
- Switch the value of current_task. At this point, current now points to the new task.
- Switch to the new stack, then push the EIP saved by the task we're switching to onto the stack. Later, a return will be performed, using this EIP as the return address; this is how it jumps back to the old code that previously called switch_to().
- Call __switch_to(). At this point, current points to the new task, and we're on the new task's stack, but various other CPU state hasn't been updated. __switch_to() handles switching the state of things like the FPU, segment descriptors, debug registers, etc.
- Upon return from __switch_to(), the return address that switch_to() manually pushed onto the stack is returned to, placing execution back where it was prior to the switch_to() in the new task. Execution has now fully resumed on the switched-to task.
x86-64 is very similar, but has to do slightly more saving and restoring of state due to the different ABI.
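The key consequence of the switch_to() steps, that a task resumes exactly where it previously switched away, can be observed in user space with the POSIX ucontext functions available in glibc. This is only an analogy, not the kernel's mechanism: swapcontext() saves the caller's registers and stack pointer and loads another context's, much as switch_to() saves EIP/ESP and adopts the next task's. The names task_b, run_switch_demo and the trace buffer are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)

static ucontext_t ctx_a, ctx_b;   /* saved state of two toy "tasks" */
static char stack_b[STACK_SIZE];  /* task B runs on its own stack */
static char trace[16];            /* records the order of execution */
static size_t trace_len;

static void record(char c)
{
    if (trace_len < sizeof(trace) - 1)
        trace[trace_len++] = c;
    trace[trace_len] = '\0';
}

/* Toy task B: runs, switches away, and later resumes after the switch. */
static void task_b(void)
{
    record('b');
    swapcontext(&ctx_b, &ctx_a);  /* B is switched out here... */
    record('d');                  /* ...and continues here when resumed */
    /* Returning ends B; uc_link hands control back to ctx_a. */
}

/* Toy task A: drives the demo and returns the recorded trace. */
static const char *run_switch_demo(void)
{
    trace_len = 0;

    getcontext(&ctx_b);
    ctx_b.uc_stack.ss_sp = stack_b;
    ctx_b.uc_stack.ss_size = sizeof(stack_b);
    ctx_b.uc_link = &ctx_a;       /* where to go when task_b returns */
    makecontext(&ctx_b, task_b, 0);

    record('a');
    swapcontext(&ctx_a, &ctx_b);  /* A switched out; B starts running */
    record('c');
    swapcontext(&ctx_a, &ctx_b);  /* A switched out; B resumes mid-function */
    record('e');
    return trace;
}
```

Calling run_switch_demo() yields the trace "abcde": each side continues right after its own swapcontext() call, which is the user-space counterpart of a task returning from switch_to() long after it called it. Unlike the kernel version, nothing here switches address spaces or FPU and segment state.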