Does the kernel's panic() function completely freeze every other process?


Question

I would like to confirm that the kernel's panic() function, and others such as kernel_halt() and machine_halt(), guarantee a complete freeze of the machine once triggered.

So, are all kernel and user processes frozen? Can panic() be interrupted by the scheduler? Can interrupt handlers still execute?

Use case: in case of a serious error, I need to be sure that the hardware watchdog resets the machine. To this end, I need to make sure that no other thread or process keeps the watchdog alive, so I need to trigger a complete halt of the system. Currently, inside my kernel module, I simply call panic() to freeze everything.

Also, is the user-space halt command guaranteed to freeze the system?

Thanks.

Edit: According to http://linux.die.net/man/2/reboot, I think the best way is to use reboot(LINUX_REBOOT_CMD_HALT): "Control is given to the ROM monitor, if there is one."
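For reference, a minimal user-space sketch of that call, assuming glibc's reboot() wrapper from <sys/reboot.h>, where RB_HALT_SYSTEM is the library's name for LINUX_REBOOT_CMD_HALT. The helper name halt_machine() is hypothetical and not taken from any existing code. The call needs CAP_SYS_BOOT (typically root) and does not return when the halt succeeds.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/reboot.h>

/* RB_HALT_SYSTEM carries the same value as LINUX_REBOOT_CMD_HALT. */
static_assert((unsigned int)RB_HALT_SYSTEM == 0xcdef0123u,
              "RB_HALT_SYSTEM should match LINUX_REBOOT_CMD_HALT");

/*
 * Hypothetical helper: flush dirty buffers, then ask the kernel to halt.
 * On success reboot() never returns; it returns -1 only on failure,
 * for example with EPERM when the caller lacks CAP_SYS_BOOT.
 */
int halt_machine(void)
{
    sync(); /* reboot() itself does not sync the disks */

    if (reboot(RB_HALT_SYSTEM) == -1) {
        fprintf(stderr, "reboot(RB_HALT_SYSTEM) failed: %s\n",
                strerror(errno));
        return -1;
    }
    return 0; /* not reached when the halt succeeds */
}
```

Calling halt_machine() from a privileged process halts the machine immediately, without running any shutdown scripts, so it should only be tried on a test system.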

Recommended answer

Thank you for the comments above. After some research, I am ready to give myself a more complete answer, below.

At least for the x86 architecture, reboot(LINUX_REBOOT_CMD_HALT) is the way to go. It invokes the reboot() syscall (see: http://lxr.linux.no/linux+v3.6.6/kernel/sys.c#L433). For the LINUX_REBOOT_CMD_HALT command (see: http://lxr.linux.no/linux+v3.6.6/kernel/sys.c#L480), the syscall calls kernel_halt() (defined here: http://lxr.linux.no/linux+v3.6.6/kernel/sys.c#L394). That function calls syscore_shutdown() to run all the registered system core shutdown callbacks, prints the "System halted" message, dumps the kernel log, and finally calls machine_halt(), which is a wrapper for native_machine_halt() (see: http://lxr.linux.no/linux+v3.6.6/arch/x86/kernel/reboot.c#L680). It is this function that stops the other CPUs (through machine_shutdown()) and then calls stop_this_cpu() to disable the last remaining working processor. The first thing stop_this_cpu() does is disable interrupts on the current processor, so the scheduler can no longer take control.
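The call order described above can be pictured with a toy user-space model. This is only a sketch: every function is a stub that records its own name, the model_ prefix and the trace helpers are hypothetical, and none of it is kernel code. In particular, the real stop_this_cpu() never returns, whereas the stub here does.

```c
#include <stdio.h>
#include <string.h>

/* Toy model of the x86 halt path: stubs only append to a trace string. */
static char trace[256];

static void record(const char *step)
{
    size_t room = sizeof(trace) - strlen(trace) - 1;

    if (trace[0] != '\0') {
        strncat(trace, " > ", room);
        room = sizeof(trace) - strlen(trace) - 1;
    }
    strncat(trace, step, room);
}

/* In the kernel this stops all the other CPUs. */
static void model_machine_shutdown(void) { record("machine_shutdown"); }

/* In the kernel this disables local interrupts and loops forever. */
static void model_stop_this_cpu(void) { record("stop_this_cpu"); }

static void model_native_machine_halt(void)
{
    record("native_machine_halt");
    model_machine_shutdown();
    model_stop_this_cpu();
}

static void model_machine_halt(void)
{
    record("machine_halt");
    model_native_machine_halt();
}

void model_kernel_halt(void)
{
    record("kernel_halt");
    record("syscore_shutdown");
    record("print System halted");
    record("dump kernel log");
    model_machine_halt();
}

const char *model_trace(void)
{
    return trace;
}
```

Calling model_kernel_halt() and then printing model_trace() lists the steps in the order the answer walks through them, ending with stop_this_cpu.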

I am not sure why the reboot() syscall still calls do_exit(0) after calling kernel_halt(). My interpretation is this: with all processors now marked as disabled, reboot() calls do_exit(0) and ends itself. Even if the scheduler were woken up, there would be no enabled processor left on which to schedule a task or service an interrupt: the system is halted. I am not sure about this explanation, since stop_this_cpu() does not appear to return (it enters an infinite loop). Maybe it is just a safeguard for the case where stop_this_cpu() fails and returns: do_exit() would then cleanly end the current task, and after that panic() is called.

As for the panic() code (defined here: http://lxr.linux.no/linux+v3.6.6/kernel/panic.c#L69), the function first disables local interrupts, then stops all processors other than the current one by calling smp_send_stop(). At that point panic() is the sole task executing on the only processor still alive, with all local interrupts disabled, so the preemptive scheduler (which is driven by a timer interrupt, after all) has no chance to run. Finally, panic() either loops for some time or calls emergency_restart(), which is supposed to restart the processor.

If you have better insight, please contribute.

