Why do processes spawned by cron end up defunct?

Question

I have some processes showing up as <defunct> in top (and ps). I've boiled things down from the real scripts and programs.

In my crontab:

* * * * * /tmp/launcher.sh /tmp/tester.sh

The contents of launcher.sh (which is of course marked executable):

#!/bin/bash
# the real script does a little argument processing here
"$@"

The contents of tester.sh (which is of course marked executable):

#!/bin/bash
sleep 27 & # the real script launches a compiled C program in the background

ps shows the following:

user       24257 24256  0 18:32 ?        00:00:00 [launcher.sh] <defunct>
user       24259     1  0 18:32 ?        00:00:00 sleep 27

Note that tester.sh does not appear--it has exited after launching the background job.

Why does launcher.sh stick around, marked <defunct>? It only seems to do this when launched by cron--not when I run it myself.

Additional note: launcher.sh is a common script on the system this runs on and is not easily modified. The other pieces (crontab, tester.sh, even the program that I run instead of sleep) can be modified much more easily.

Recommended answer

Because they haven't been the subject of a wait(2) system call.

Since someone may wait for these processes in the future, the kernel can't completely get rid of them or it won't be able to execute the wait system call because it won't have the exit status or evidence of its existence any more.
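
The lingering entry can be reproduced outside cron. The following is a minimal Python sketch, not anything cron itself runs. It assumes a Unix-like system where os.fork is available, and the function name zombie_demo and the exit code 7 are made up for illustration. The child exits at once, the parent deliberately delays its wait, and the PID stays visible until the status is collected.

```python
import os
import time


def zombie_demo(exit_code=7):
    """Fork a child that exits at once, delay the wait, then reap it.

    Returns (pid_present_before_wait, exit_code_seen, pid_present_after_wait).
    """
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)  # child: exit immediately

    time.sleep(0.5)  # parent: by now the child is dead but not yet reaped

    def pid_exists():
        # Signal 0 delivers nothing; it only checks that the PID exists.
        try:
            os.kill(pid, 0)
            return True
        except ProcessLookupError:
            return False

    # Still present: the kernel keeps the entry so the status can be fetched.
    present_before = pid_exists()

    # Equivalent of wait(2): collect the saved status and release the entry.
    _, status = os.waitpid(pid, 0)
    code = os.WEXITSTATUS(status)

    present_after = pid_exists()
    return present_before, code, present_after


if __name__ == "__main__":
    print(zombie_demo())
```

During the half-second pause, ps would list the child as <defunct>. The exit code survives the delay because the kernel holds on to it until waitpid asks for it.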

When you start one from the shell, your shell is trapping SIGCHLD and doing various wait operations anyway, so nothing stays defunct for long.
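
As a rough illustration of what "trapping SIGCHLD and doing various wait operations" means, here is a hedged Python sketch of a parent that reaps children from a signal handler. It is not how any particular shell is implemented. The names reaped, reap_children and spawn_and_autoreap are hypothetical, and the code assumes it runs in the main thread of a Unix process.

```python
import os
import signal
import time

reaped = {}  # pid -> exit code, filled in by the SIGCHLD handler


def reap_children(signum, frame):
    """SIGCHLD handler: collect every child that has already exited."""
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            return  # this process has no children left
        if pid == 0:
            return  # children exist, but none has exited yet
        if os.WIFEXITED(status):
            reaped[pid] = os.WEXITSTATUS(status)
        else:
            reaped[pid] = None  # killed by a signal; no exit code


def spawn_and_autoreap(exit_code=3, timeout=5.0):
    """Fork a short-lived child and let the handler reap it."""
    previous = signal.signal(signal.SIGCHLD, reap_children)
    try:
        pid = os.fork()
        if pid == 0:
            os._exit(exit_code)  # child: exit immediately
        # Parent: do no explicit wait; just poll what the handler recorded.
        deadline = time.monotonic() + timeout
        while pid not in reaped and time.monotonic() < deadline:
            time.sleep(0.05)
        return reaped.get(pid)
    finally:
        if previous is not None:
            signal.signal(signal.SIGCHLD, previous)  # restore old handler


if __name__ == "__main__":
    print(spawn_and_autoreap())
```

Because the handler calls waitpid as soon as the signal arrives, the child spends almost no time as a zombie. A parent that never gets around to this call leaves the entry in place.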

But cron isn't in a wait state, it is sleeping, so the defunct child may stick around for a while until cron wakes up.

Update: Responding to comment... Hmm. I did manage to duplicate the issue:

 PPID   PID  PGID  SESS COMMAND
    1  3562  3562  3562 cron
 3562  1629  3562  3562  \_ cron
 1629  1636  1636  1636      \_ sh <defunct>
    1  1639  1636  1636 sleep

So, what happened was, I think:


  • cron forks, and the cron child starts a shell.
  • The shell (1636) becomes leader of session and process group 1636 and starts sleep.
  • The shell exits, and SIGCHLD is sent to its parent, the cron child 1629.
  • The signal is ignored or mishandled.
  • The shell turns zombie. Note that sleep is reparented to init, so when sleep exits, init gets the signal and cleans up. I'm still trying to figure out when the zombie gets reaped. Probably, once it has no active children, cron 1629 figures out it can exit; at that point the zombie is reparented to init and gets reaped. So now we wonder about the missing SIGCHLD that cron should have processed.

It isn't necessarily vixie cron's fault. libdaemon installs a SIGCHLD handler during daemon_fork(), and this could interfere with signal delivery on a quick exit by the intermediate process 1629.

Now, I don't even know if vixie cron on my Ubuntu system is built with libdaemon, but at least I have a new theory. :-)
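
The reparenting step in the list above can be observed directly. This is a small Python sketch under stated assumptions: a Unix-like system with os.fork, and a hypothetical helper name orphan_parent_pids. An intermediate process plays the role of the shell and exits right after forking a grandchild, which plays the role of sleep. The grandchild then reports which PID has become its parent. That PID is usually 1, but on Linux it can instead be a process registered as a child subreaper, so the sketch only reports the value.

```python
import os
import time


def orphan_parent_pids(timeout=5.0):
    """Return (intermediate_pid, parent_pid_seen_by_orphan)."""
    read_fd, write_fd = os.pipe()
    middle = os.fork()
    if middle == 0:
        # Intermediate process (the "shell" in the listing).
        os.close(read_fd)
        middle_pid = os.getpid()
        if os.fork() == 0:
            # Grandchild (the "sleep"): wait until the intermediate process
            # has exited, which is when the kernel reparents us.
            deadline = time.monotonic() + timeout
            while os.getppid() == middle_pid and time.monotonic() < deadline:
                time.sleep(0.01)
            os.write(write_fd, str(os.getppid()).encode())
            os._exit(0)
        os._exit(0)  # exit at once, orphaning the grandchild

    # Original parent: reap the intermediate process so it does not linger.
    os.close(write_fd)
    os.waitpid(middle, 0)

    # Read until every write end is closed, then parse the reported PID.
    data = b""
    while True:
        chunk = os.read(read_fd, 64)
        if not chunk:
            break
        data += chunk
    os.close(read_fd)
    return middle, int(data)


if __name__ == "__main__":
    middle, new_parent = orphan_parent_pids()
    print("intermediate:", middle, "orphan's new parent:", new_parent)
```

The orphan is never waited for by the script. Whatever process adopts it is responsible for reaping it, which matches how init eventually cleans up the sleep in the listing.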
