Calling kill on a child process with SIGTERM terminates parent process, but calling it with SIGKILL keeps the parent alive
This is a continuation of "How to prevent SIGINT in child process from propagating to and killing parent process?"
In the above question, I learned that SIGINT wasn't being bubbled up from child to parent, but rather, is issued to the entire foreground process group, meaning I needed to write a signal handler to prevent the parent from exiting when I hit CTRL + C.
I tried to implement this, but here's the problem. Regarding specifically the kill syscall I invoke to terminate the child, if I pass in SIGKILL, everything works as expected, but if I pass in SIGTERM, it also terminates the parent process, showing Terminated: 15 in the shell prompt later.
Even though SIGKILL works, I want to use SIGTERM because, from what I've read, it is the better idea in general: it gives the process being signalled a chance to clean up after itself.
The code below is a stripped-down example of what I came up with:
#include <stdio.h>
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>

pid_t CHILD = 0;

void handle_sigint(int s) {
    (void)s;
    if (CHILD != 0) {
        kill(CHILD, SIGTERM); // <-- SIGKILL works, but SIGTERM kills parent
        CHILD = 0;
    }
}

int main() {
    // Set up signal handling
    char str[2];
    struct sigaction sa = {
        .sa_flags = SA_RESTART,
        .sa_handler = handle_sigint
    };
    sigaction(SIGINT, &sa, NULL);

    for (;;) {
        printf("1) Open SQLite\n"
               "2) Quit\n"
               "-> "
        );
        scanf("%1s", str);
        if (str[0] == '1') {
            CHILD = fork();
            if (CHILD == 0) {
                execlp("sqlite3", "sqlite3", NULL);
                printf("exec failed\n");
            } else {
                wait(NULL);
                printf("Hi\n");
            }
        } else if (str[0] == '2') {
            break;
        } else {
            printf("Invalid!\n");
        }
    }
}
My educated guess as to why this is happening would be something intercepts the SIGTERM, and kills the entire process group. Whereas, when I use SIGKILL, it can't intercept the signal so my kill call works as expected. That's just a stab in the dark though.
Could someone explain why this is happening?
As a side note, I'm not thrilled with my handle_sigint function. Is there a more standard way of killing an interactive child process?
You have too many bugs in your code (starting with not clearing the signal mask in the struct sigaction) for anyone to explain the effects you are seeing.
Instead, consider the following working example code, say example.c:
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <signal.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>

/* Child process PID, and atomic functions to get and set it.
 * Do not access the internal_child_pid, except using the set_ and get_ functions.
*/
static pid_t internal_child_pid = 0;
static inline void  set_child_pid(pid_t p) { __atomic_store_n(&internal_child_pid, p, __ATOMIC_SEQ_CST); }
static inline pid_t get_child_pid(void)    { return __atomic_load_n(&internal_child_pid, __ATOMIC_SEQ_CST); }

static void forward_handler(int signum, siginfo_t *info, void *context)
{
    const pid_t target = get_child_pid();

    if (target != 0 && info->si_pid != target)
        kill(target, signum);
}

static int forward_signal(const int signum)
{
    struct sigaction act;

    memset(&act, 0, sizeof act);
    sigemptyset(&act.sa_mask);
    act.sa_sigaction = forward_handler;
    act.sa_flags = SA_SIGINFO | SA_RESTART;

    if (sigaction(signum, &act, NULL))
        return errno;

    return 0;
}

int main(int argc, char *argv[])
{
    int   status;
    pid_t p, r;

    if (argc < 2 || !strcmp(argv[1], "-h") || !strcmp(argv[1], "--help")) {
        fprintf(stderr, "\n");
        fprintf(stderr, "Usage: %s [ -h | --help ]\n", argv[0]);
        fprintf(stderr, "       %s COMMAND [ ARGS ... ]\n", argv[0]);
        fprintf(stderr, "\n");
        return EXIT_FAILURE;
    }

    /* Install signal forwarders. */
    if (forward_signal(SIGINT) ||
        forward_signal(SIGHUP) ||
        forward_signal(SIGTERM) ||
        forward_signal(SIGQUIT) ||
        forward_signal(SIGUSR1) ||
        forward_signal(SIGUSR2)) {
        fprintf(stderr, "Cannot install signal handlers: %s.\n", strerror(errno));
        return EXIT_FAILURE;
    }

    p = fork();
    if (p == (pid_t)-1) {
        fprintf(stderr, "Cannot fork(): %s.\n", strerror(errno));
        return EXIT_FAILURE;
    }

    if (!p) {
        /* Child process. */
        execvp(argv[1], argv + 1);
        fprintf(stderr, "%s: %s.\n", argv[1], strerror(errno));
        return EXIT_FAILURE;
    }

    /* Parent process. Ensure signals are reflected. */
    set_child_pid(p);

    /* Wait until the child we created exits. */
    while (1) {
        status = 0;
        r = waitpid(p, &status, 0);

        /* Error? */
        if (r == -1) {
            /* EINTR is not an error. Occurs more often if
               SA_RESTART is not specified in sigaction flags. */
            if (errno == EINTR)
                continue;

            fprintf(stderr, "Error waiting for child to exit: %s.\n", strerror(errno));
            status = EXIT_FAILURE;
            break;
        }

        /* Child p exited? */
        if (r == p) {
            if (WIFEXITED(status)) {
                if (WEXITSTATUS(status))
                    fprintf(stderr, "Command failed [%d]\n", WEXITSTATUS(status));
                else
                    fprintf(stderr, "Command succeeded [0]\n");
            } else
            if (WIFSIGNALED(status))
                fprintf(stderr, "Command exited due to signal %d (%s)\n", WTERMSIG(status), strsignal(WTERMSIG(status)));
            else
                fprintf(stderr, "Command process died from unknown causes!\n");
            break;
        }
    }

    /* This is a poor hack, but works in many (but not all) systems.
       Instead of returning a valid code (EXIT_SUCCESS, EXIT_FAILURE)
       we return the entire status word from the child process. */
    return status;
}
Compile it using e.g.
gcc -Wall -O2 example.c -o example
and run using e.g.
./example sqlite3
You'll notice that Ctrl+C does not interrupt sqlite3 (but then again, it does not even if you were to run sqlite3 directly); instead, you just see ^C on screen. This is because sqlite3 sets up the terminal in such a way that Ctrl+C does not cause a signal, and is just interpreted as normal input.
You can exit from sqlite3 using the .quit command, or pressing Ctrl+D at the start of a line.
You'll see that the original program will output a Command ... [] line afterwards, before returning you to the command line. Thus, the parent process is not killed/harmed/bothered by the signals.
You can use ps f to look at a tree of your terminal processes, and that way find out the PIDs of the parent and child processes, and send signals to either one to observe what happens.
Note that because the SIGSTOP signal cannot be caught, blocked, or ignored, it would be nontrivial to reflect the job control signals (as when you use Ctrl+Z). For proper job control, the parent process would need to set up a new session and a process group, and temporarily detach from the terminal. That too is quite possible, but a bit beyond the scope here, as it involves quite detailed behaviour of sessions, process groups, and terminals, to manage correctly.
Let's deconstruct the above example program.
The example program itself first installs some signal reflectors, then forks a child process, and that child process executes the command sqlite3. (You can specify any executable and any parameter strings to the program.)
The internal_child_pid variable, and the set_child_pid() and get_child_pid() functions, are used to manage the child process PID atomically. __atomic_store_n() and __atomic_load_n() are compiler-provided built-ins; for GCC, see the GCC documentation on the __atomic built-ins for details. They avoid the problem of a signal occurring while the child pid is only partially assigned. On some common architectures this cannot occur, but this is intended as a careful example, so atomic accesses are used to ensure only a complete (old or new) value is ever seen. We could avoid using these completely, if we blocked the related signals temporarily during the transition instead. Again, I decided the atomic accesses are simpler, and might be interesting to see in practice.
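For comparison, the same get/set pair can be sketched with C11 <stdatomic.h> instead of the GCC-specific built-ins. This is only a sketch under assumptions: it assumes a C11 compiler, and that atomic loads and stores of a pid_t are lock-free on the target (atomic operations are only guaranteed to be safe inside signal handlers when they are lock-free). The names child_pid_cell, child_pid_store() and child_pid_load() are made up for illustration and are not part of the example program.

```c
#include <stdatomic.h>
#include <sys/types.h>

/* Hypothetical names; not part of the example program above. */
static _Atomic pid_t child_pid_cell = 0;

/* Store the child PID so that a signal handler never sees a
   half-written value. */
static void child_pid_store(pid_t p)
{
    atomic_store(&child_pid_cell, p);
}

/* Load the child PID; 0 means "no child yet". */
static pid_t child_pid_load(void)
{
    return atomic_load(&child_pid_cell);
}
```

A handler would call child_pid_load() exactly where forward_handler() calls get_child_pid(); atomic_store() and atomic_load() default to sequentially consistent ordering, matching the __ATOMIC_SEQ_CST used above.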
The forward_handler() function obtains the child process PID atomically, then verifies it is nonzero (that we know we have a child process), and that we are not forwarding a signal sent by the child process (just to ensure we don't cause a signal storm, the two bombarding each other with signals). The various fields in the siginfo_t structure are listed in the man 2 sigaction man page.
The forward_signal() function installs the above handler for the specified signal signum. Note that we first use memset() to clear the entire structure to zeros. Clearing it this way ensures future compatibility, if some of the padding in the structure is converted to data fields.
The .sa_mask field in the struct sigaction is an unordered set of signals. The signals set in the mask are blocked from delivery in the thread that is executing the signal handler. (For the above example program, we can safely say that these signals are blocked while the signal handler is run; it's just that in multithreaded programs, the signals are only blocked in the specific thread that is used to run the handler.)
It is important to use sigemptyset(&act.sa_mask) to clear the signal mask. Simply setting the structure to zero does not suffice, even if it works (probably) in practice on many machines. (I don't know; I haven't even checked. I prefer robust and reliable over lazy and fragile any day!)
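To make the role of .sa_mask more concrete, here is a minimal sketch of a helper that starts from an empty set and then adds extra signals to be blocked while the handler runs. The helper name install_with_mask() and its parameters are hypothetical; the example program above always installs an empty mask.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <stddef.h>
#include <errno.h>

/* Hypothetical helper: install `handler` for `signum`, and additionally
   block each signal in blocked[0..n-1] while the handler runs.
   Returns 0 on success, or a nonzero errno value otherwise. */
static int install_with_mask(int signum, void (*handler)(int),
                             const int *blocked, size_t n)
{
    struct sigaction act;
    size_t i;

    memset(&act, 0, sizeof act);   /* clear everything, padding included */
    sigemptyset(&act.sa_mask);     /* the portable way to get an empty set */

    for (i = 0; i < n; i++)
        if (sigaddset(&act.sa_mask, blocked[i]) == -1)
            return errno;          /* invalid signal number */

    act.sa_handler = handler;
    act.sa_flags = 0;

    if (sigaction(signum, &act, NULL) == -1)
        return errno;

    return 0;
}
```

For instance, passing an array containing SIGUSR2 when installing a SIGUSR1 handler means a SIGUSR2 arriving while that handler runs is held back until the handler returns.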
The flags used include SA_SIGINFO, because the handler uses the three-argument form (and uses the si_pid field of the siginfo_t). The SA_RESTART flag is only there because the OP wished to use it; it simply means that, if possible, the C library and the kernel try to avoid returning an errno == EINTR error if a signal is delivered using a thread currently blocking in a syscall (like wait()). You can remove the SA_RESTART flag, and add a debugging fprintf(stderr, "Hey!\n"); in a suitable place in the loop in the parent process, to see what happens then.
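As a rough illustration of what a caller has to do when SA_RESTART is not set, the retry can be isolated in a small helper. This is a sketch only; wait_for_child() is a hypothetical name, and the example program does the same thing inline in its while loop.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>

/* Hypothetical helper: wait for `pid`, retrying whenever a signal handler
   installed without SA_RESTART interrupts the call. Returns `pid` on
   success, or (pid_t)-1 with errno set on a real error. */
static pid_t wait_for_child(pid_t pid, int *status)
{
    pid_t r;

    do {
        r = waitpid(pid, status, 0);
    } while (r == (pid_t)-1 && errno == EINTR);

    return r;
}
```

Putting the debugging fprintf() inside the do-while body (for the EINTR case) is one way to see how often the call is actually interrupted.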
The sigaction() function will return 0 if there is no error, or -1 with errno set otherwise. The forward_signal() function returns 0 if the forward_handler was assigned successfully, but a nonzero errno number otherwise. Some do not like this kind of return value (they prefer just returning -1 for an error, rather than the errno value itself), but I've for some unreasonable reason grown fond of this idiom. Change it if you want, by all means.
Now we get to main().
If you run the program without parameters, or with a single -h or --help parameter, it'll print a usage summary. Again, doing it this way is just something I'm fond of; getopt() and getopt_long() are more commonly used to parse command-line options. For this kind of trivial program, I just hardcoded the parameter checks.
In this case, I intentionally left the usage output very short. It would really be much better with an additional paragraph about exactly what the program does. These kinds of texts -- and especially comments in the code (explaining the intent, the idea of what the code should do, rather than describing what the code actually does) -- are very important. It's been well over two decades since the first time I got paid to write code, and I'm still learning how to comment -- describe the intent of -- my code better, so I think the sooner one starts working on that, the better.
The fork() part ought to be familiar. If it returns -1, the fork failed (probably due to limits or some such), and it is a very good idea to print out the errno message then. The return value will be 0 in the child, and the child process ID in the parent process.
The execvp() function takes two arguments: the name of the binary file (the directories specified in the PATH environment variable will be used to search for such a binary), as well as an array of pointers to the arguments to that binary. The first argument will be argv[0] in the new binary, i.e. the command name itself.
The execvp(argv[1], argv + 1); call is actually quite simple to parse, if you compare it to the above description. argv[1] names the binary to be executed. argv + 1 is basically equivalent to (char **)(&argv[1]), i.e. it is an array of pointers that starts with argv[1] instead of argv[0]. Once again, I'm simply fond of the execvp(argv[n], argv + n) idiom, because it allows one to execute another command specified on the command line without having to worry about parsing a command line, or executing it through a shell (which is sometimes downright undesirable).
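The idiom can be wrapped in a small helper that runs an argument vector and reports its exit code. This is a minimal sketch, not part of the example program: run_command() is a hypothetical name, and the 127 exit code for a failed exec is borrowed from common shell convention as an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <errno.h>

/* Hypothetical helper: run argv[0] with arguments argv[1..], searching
   PATH, and return its exit code; returns -1 if the fork or wait failed,
   or if the command was killed by a signal. */
static int run_command(char *const argv[])
{
    int   status = 0;
    pid_t p = fork();

    if (p == (pid_t)-1)
        return -1;

    if (p == 0) {
        execvp(argv[0], argv);
        _exit(127);   /* only reached if execvp() failed */
    }

    while (waitpid(p, &status, 0) == (pid_t)-1)
        if (errno != EINTR)
            return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

From a main(int argc, char *argv[]) with at least one argument, calling run_command(argv + 1) runs whatever command was given on the command line, with no shell involved and no command-line parsing needed.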
The man 7 signal man page explains what happens to signal handlers at fork() and exec(). In short, the signal handlers are inherited over a fork(), but reset to defaults at exec(). Which is, fortunately, exactly what we want, here.
If we were to fork first, and then install the signal handlers, we'd have a window during which the child process already exists, but the parent still has default dispositions (mostly termination) for the signals.
Instead, we could just block these signals using e.g. sigprocmask() in the parent process before forking. Blocking a signal means it is made to "wait"; it will not be delivered until the signal is unblocked. In the child process, the signals could stay blocked until just before the exec(), as the signal dispositions are reset to defaults over an exec() anyway; note, however, that the signal mask itself is preserved over an exec(), so the child should restore the original mask before executing the command. In the parent process, we could then (or before forking, it does not matter) install the signal handlers, and finally unblock the signals. This way we would not need the atomic stuff, nor even check if the child pid is zero, since the child pid will be set to its actual value well before any signal can be delivered!
The while loop is basically just a loop around the waitpid() call, until the exact child process we started exits, or something funny happens (the child process vanishes somehow). This loop contains pretty careful error checking, as well as the correct EINTR handling if the signal handlers were to be installed without the SA_RESTART flag.
If the child process we forked exits, we check the exit status and/or reason it died, and print a diagnostic message to standard error.
Finally, the program ends with a horrible hack: instead of returning EXIT_SUCCESS or EXIT_FAILURE, we return the entire status word we obtained with waitpid when the child process exited. I left this in because it is sometimes used in practice, when you want to return the same or a similar exit status code as the child process returned. So, it's for illustration. If you ever find yourself in a situation where your program should return the same exit status as a child process it forked and executed, this is still better than setting up machinery to have the process kill itself with the same signal that killed the child process. Just put a prominent comment there if you ever need to use this, and a note in the installation instructions so that those who compile the program on architectures where that might be unwanted can fix it.