How do unix signals work?

Question

How do signals work in unix? I went through W.R. Stevens but was unable to understand. Please help me.

Solution

The explanation below is not exact, and several aspects of how this works differ between systems (and, for some parts, perhaps even between the same OS on different hardware), but I think it is good enough to satisfy your curiosity and let you use signals. Most people start using signals in their programs without even this level of understanding, but I wanted to understand them before I got comfortable using them.

Signal delivery

The OS kernel keeps a data structure called a process control block for each running process, which holds data about that process. It can be looked up by process ID (PID) and includes a table of signal actions and a table of pending signals.

When a signal is sent to a process, the OS kernel looks up that process's process control block and examines the signal action table to find the action for the particular signal being sent. If the signal action value is SIG_IGN, the kernel forgets about the new signal. If the signal action value is SIG_DFL, the kernel looks up the default action for that signal in another table and performs that action. If the value is anything else, it is assumed to be the address of a function within the receiving process, which should be called. The values of SIG_IGN and SIG_DFL are numbers cast to function pointers whose values are not valid addresses within a process's address space (such as 0 and 1, which are both in page 0, which is never mapped into a process).
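The three kinds of action value can be tried out from user space with sigaction. The following is only a minimal sketch, assuming a single-threaded POSIX program; the names on_usr1, set_usr1_action and demo_actions are made up for illustration and are not part of any API.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical flag and handler: the handler only records that it ran. */
static volatile sig_atomic_t usr1_seen = 0;

static void on_usr1(int signo)
{
    (void)signo;
    usr1_seen = 1;
}

/* Set the action for SIGUSR1. The argument may be SIG_IGN, SIG_DFL,
   or the address of a real handler function.
   Returns 0 on success, -1 on failure. */
static int set_usr1_action(void (*handler)(int))
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Returns 1 if the handler ran only while a real function was installed. */
static int demo_actions(void)
{
    usr1_seen = 0;

    if (set_usr1_action(SIG_IGN) != 0) return 0;
    raise(SIGUSR1);                 /* ignored: the signal is discarded */
    int seen_while_ignored = usr1_seen;

    if (set_usr1_action(on_usr1) != 0) return 0;
    raise(SIGUSR1);                 /* handler runs before raise() returns */
    int seen_with_handler = usr1_seen;

    /* Restore the default action (for SIGUSR1 that is "terminate"),
       so no signal is raised after this point. */
    set_usr1_action(SIG_DFL);

    return seen_while_ignored == 0 && seen_with_handler == 1;
}
```

Calling demo_actions() from main should return 1 if the ignored signal never reached the handler and the second one did.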

If the process has registered a signal handling function (the signal action value is neither SIG_IGN nor SIG_DFL), an entry for that signal is made in the pending signal table and the process is marked as ready to run (it may have been waiting on something, such as data becoming available for a call to read, waiting for a signal, or several other things).
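The pending signal table is normally invisible, because a pending signal is delivered almost immediately. One way to observe it, sketched here under the assumption of a single-threaded POSIX program, is to block the signal first so that it has to stay pending, and then ask the kernel with sigpending. Blocking is my addition to make the state observable; on_usr2 and demo_pending are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t delivered = 0;

static void on_usr2(int signo)
{
    (void)signo;
    delivered = 1;
}

/* Returns 1 if SIGUSR2 was reported as pending while blocked and was
   then delivered to the handler once it was unblocked. */
static int demo_pending(void)
{
    struct sigaction sa;
    sigset_t block, old, pending;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr2;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR2, &sa, NULL) != 0) return 0;

    /* Block SIGUSR2 so the kernel has to keep it pending. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR2);
    if (sigprocmask(SIG_BLOCK, &block, &old) != 0) return 0;

    delivered = 0;
    raise(SIGUSR2);                        /* recorded, not yet delivered */

    sigemptyset(&pending);
    if (sigpending(&pending) != 0) return 0;
    int was_pending = (sigismember(&pending, SIGUSR2) == 1);
    int ran_while_blocked = delivered;

    /* Restore the old mask: the pending signal is delivered now. */
    sigprocmask(SIG_SETMASK, &old, NULL);

    return was_pending && !ran_while_blocked && delivered == 1;
}
```

This assumes SIGUSR2 was not already blocked when demo_pending() was called; otherwise restoring the old mask would leave the signal pending.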

The next time the process is run, the OS kernel first adds some data to the stack and changes the instruction pointer for that process so that it looks almost as if the process itself had just called the signal handler. This is not entirely correct, and it deviates enough from what actually happens that I'll say more about it in a little bit.

The signal handler function can do whatever it does (it is part of the process it was called on behalf of, so it was written with knowledge of what that program should do with that signal). When the signal handler returns, the regular code of the process begins executing again. (Again, not accurate, but more on that next.)

OK, the above should have given you a pretty good idea of how signals are delivered to a process. I think this simplified version is needed before you can grasp the full picture, which includes some more complicated stuff.

Very often the OS kernel needs to know when a signal handler returns. This is because signal handlers take an argument (which may require stack space), because you can block the same signal from being delivered a second time while the signal handler is executing, and because system calls may be restarted after a signal is delivered. Accomplishing this takes a little more than stack and instruction pointer changes.
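These three features are all requested through sigaction. The sketch below assumes a single-threaded program on a Linux-like system. It installs a handler that receives arguments (SA_SIGINFO), asks for interrupted system calls to be restarted (SA_RESTART), and relies on the default behaviour that the signal being handled is blocked while its handler runs. The handler re-raises the signal once from inside itself to show that the second delivery waits until the first invocation has returned. The names on_usr1_info and demo_handler_mask are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t calls = 0;      /* handler invocations      */
static volatile sig_atomic_t depth = 0;      /* current nesting level    */
static volatile sig_atomic_t max_depth = 0;  /* deepest nesting observed */
static volatile sig_atomic_t last_signo = 0; /* taken from the siginfo_t */

/* SA_SIGINFO handlers receive the signal number, a siginfo_t describing
   the signal, and a saved context pointer (unused here). */
static void on_usr1_info(int signo, siginfo_t *info, void *ucontext)
{
    (void)ucontext;
    depth++;
    if (depth > max_depth) max_depth = depth;
    calls++;
    last_signo = info ? info->si_signo : signo;

    /* SIGUSR1 is blocked while this handler runs, so this second
       signal only becomes pending instead of nesting. */
    if (calls == 1)
        raise(SIGUSR1);

    depth--;
}

/* Returns 1 if the handler ran twice but never nested. */
static int demo_handler_mask(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_usr1_info;
    sa.sa_flags = SA_SIGINFO | SA_RESTART;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0) return 0;

    calls = 0;
    depth = 0;
    max_depth = 0;
    raise(SIGUSR1);

    return calls == 2 && max_depth == 1 && last_signo == SIGUSR1;
}
```

The fact that the second invocation starts only after the first has returned is exactly why the kernel needs to be told when a handler finishes: that is the moment it restores the old signal mask and delivers whatever became pending in the meantime.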

What has to happen is that the kernel needs to make the process tell it that it has finished executing the signal handler function. This may be done by mapping a section of RAM into the process's address space that contains code to make this system call, and making the return address for the signal handler function (the top value on the stack when this function starts running) the address of this code. I think that this is how it is done in Linux (at least in newer versions). Another way to accomplish this (I don't know if this is done, but it could be) would be to make the return address for the signal handler function an invalid address (such as NULL), which would cause an interrupt on most systems and give the OS kernel control again. It doesn't matter a whole lot how this happens, but the kernel has to get control again to fix up the stack and to know that the signal handler has completed.

While looking into another question I learned that the Linux kernel does map a page into the process for this, but that the actual system call for registering signal handlers (the one that sigaction calls) takes an sa_restorer field, which is an address that should be used as the return address from the signal handler, and the kernel just makes sure that it is put there. The code at this address issues the "I'm done" system call (sigreturn), and the kernel knows that the signal handler has finished.

Signal generation

I'm mostly assuming that you know how signals are generated in the first place. The OS can generate them on behalf of a process because something happened, such as a timer expiring, a child process dying, the process accessing memory that it should not access, or the process issuing an instruction that it should not (either one that does not exist or one that is privileged), among many other things. The timer case is functionally a little different from the others because it may occur while the process is not running, so it is more like a signal sent with the kill system call. The non-timer signals sent on behalf of the current process are generated when an interrupt occurs because the current process is doing something wrong. This interrupt gives the kernel control (just like a system call does), and the kernel generates the signal to be delivered to the current process.
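Two of these sources are easy to reproduce safely: a signal sent explicitly with kill, and the SIGCHLD that the kernel generates when a child process exits. The following is a rough sketch that assumes a single-threaded POSIX program; note_signal and demo_generation are hypothetical names. SIGCHLD is blocked before fork so that the parent cannot miss it, and sigsuspend then unblocks it and sleeps in one atomic step.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t got_usr1 = 0;
static volatile sig_atomic_t got_chld = 0;

static void note_signal(int signo)
{
    if (signo == SIGUSR1) got_usr1 = 1;
    else if (signo == SIGCHLD) got_chld = 1;
}

/* Returns 1 if both the kill()-sent and the kernel-generated signal
   reached the handler. */
static int demo_generation(void)
{
    struct sigaction sa;
    sigset_t block, old, wait_mask;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = note_signal;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0) return 0;
    if (sigaction(SIGCHLD, &sa, NULL) != 0) return 0;

    /* Case 1: a signal sent explicitly with the kill system call. */
    got_usr1 = 0;
    if (kill(getpid(), SIGUSR1) != 0) return 0;

    /* Case 2: the kernel generates SIGCHLD when a child exits. */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &block, &old) != 0) return 0;
    got_chld = 0;

    pid_t pid = fork();
    if (pid < 0) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        return 0;
    }
    if (pid == 0)
        _exit(0);                   /* child: exit immediately */

    /* Wait with SIGCHLD unblocked until the handler has run. */
    wait_mask = old;
    sigdelset(&wait_mask, SIGCHLD);
    while (!got_chld)
        sigsuspend(&wait_mask);

    sigprocmask(SIG_SETMASK, &old, NULL);
    waitpid(pid, NULL, 0);          /* reap the child */

    return got_usr1 == 1 && got_chld == 1;
}
```

The fault cases (bad memory access, illegal instruction) are generated the same way from the kernel's side, but they are left out here because deliberately triggering them and returning from the handler is not well defined.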
