Linux, waitpid, WNOHANG, child process, zombie


Question

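/* Parent loop: fork a child, poll until it exits, then fork again. */
for (;;) {
  if (fork() == 0) break;  /* the child leaves the loop */
  int sig = 0;             /* receives the wait status, not a signal number */
  /* g->pid[1] presumably holds the child's PID (set elsewhere); poll every 10 ms */
  for (;; usleep(10000)) {
    pid_t wpid = waitpid(g->pid[1], &sig, WNOHANG);
    if (wpid > 0) break;   /* child reaped */
    if (wpid < 0) print("wait error: %s\n", strerror(errno));
  }
}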

But when the child process is killed with signal 9 (SIGKILL), it becomes a zombie.

waitpid should return the PID of the child process immediately!
Instead, waitpid only returned the PID about 90 seconds later.

cube     28139  0.0  0.0  70576   900 ?        Ss   04:24   0:07 ./daemon -d
cube     28140  9.3  0.0      0     0 ?        Zl   04:24 106:19 [daemon] <defunct>

Here is the strace of the parent.

The parent is not stuck; it keeps calling wait4.

strace -p 28139
Process 28139 attached - interrupt to quit
restart_syscall(<... resuming interrupted call ...>) = 0
wait4(28140, 0x7fff08a2681c, WNOHANG, NULL) = 0
nanosleep({0, 10000000}, NULL)          = 0
wait4(28140, 0x7fff08a2681c, WNOHANG, NULL) = 0

About 90 seconds later, the parent received SIGCHLD and wait4 returned the PID of the dead child.

--- SIGCHLD (Child exited) @ 0 (0) ---
restart_syscall(<... resuming interrupted call ...>) = 0
wait4(28140, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGKILL}], WNOHANG, NULL) = 28140

Why can't the child process be reaped immediately? Instead, it unexpectedly stays a zombie.
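
For reference, the polling pattern from the question can be sketched as a small bounded helper that also decodes the wait status strace prints above. This is only a sketch under stated assumptions: `poll_child` and `was_sigkilled` are hypothetical names that do not appear in the original program, and the poll limit is added so the loop cannot spin forever.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: poll `child` with WNOHANG, sleeping interval_ms
 * between attempts. Returns the child's PID once it has been reaped,
 * 0 if max_polls attempts ran out, or -1 on a waitpid error. */
pid_t poll_child(pid_t child, int *status, long interval_ms, int max_polls)
{
    assert(status != NULL && interval_ms >= 0);
    struct timespec ts = { interval_ms / 1000, (interval_ms % 1000) * 1000000L };

    for (int i = 0; i < max_polls; i++) {
        pid_t wpid = waitpid(child, status, WNOHANG);
        if (wpid > 0)
            return wpid;            /* child state collected */
        if (wpid < 0) {
            fprintf(stderr, "wait error: %s\n", strerror(errno));
            return -1;              /* e.g. ECHILD: nothing to wait for */
        }
        nanosleep(&ts, NULL);       /* wpid == 0: child not reapable yet */
    }
    return 0;
}

/* Hypothetical helper: the same check strace prints for the status,
 * WIFSIGNALED(s) && WTERMSIG(s) == SIGKILL. */
int was_sigkilled(int status)
{
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL;
}
```

A caller might run `poll_child(pid, &status, 10, 500)` and then check `was_sigkilled(status)`. A return value of 0 from `waitpid` with WNOHANG only means the child cannot be reaped yet, which matches the repeated `= 0` results in the trace.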

Recommended answer

I finally found out, after tracing in depth with lsof, that there were some fd leaks.

After the fd leaks were fixed, the problem was gone.
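
The answer does not say how the leaks were tracked down beyond using lsof. As a rough complement, a process can watch its own descriptor count from the inside. The sketch below assumes Linux with `/proc` mounted; `count_open_fds` is a hypothetical name, not something from the original program.

```c
#include <assert.h>
#include <dirent.h>
#include <string.h>

/* Hypothetical helper: count the descriptors currently open in this
 * process by listing /proc/self/fd (Linux-specific assumption).
 * Returns -1 if the directory cannot be opened. */
int count_open_fds(void)
{
    DIR *dir = opendir("/proc/self/fd");
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        /* skip the "." and ".." entries */
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0)
            continue;
        count++;
    }
    closedir(dir);

    /* the directory stream's own descriptor is always in the listing */
    assert(count > 0);
    return count - 1;
}
```

Logging this value periodically in a long-running daemon is one way to notice a count that only ever grows, which is the usual symptom of a descriptor leak; lsof can then show which files or sockets are involved.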
