Zombie process in python multiprocessing daemon

Question

After researching python daemons, this walkthrough seemed to be the most robust: http://www.jejik.com/articles/2007/02/a_simple_unix_linux_daemon_in_python/

Now I am trying to implement a pool of workers inside the daemon class, which I believe is working (I have not thoroughly tested the code), except that on shutdown I get a zombie process. I have read that I need to wait for the return code from the child, but I cannot yet see exactly how to do this.

Here are some code snippets:

def stop(self):
    ...
    try:
        while 1:
            self.pool.close()
            self.pool.join()
            os.kill(pid, SIGTERM)
            time.sleep(0.1)
    ...

Here I have tried os.killpg and a number of the os.wait methods, but with no improvement. I have also played with closing/joining the pool before and after the os.kill. This loop, as it stands, never ends, and as soon as it hits the os.kill I get a zombie process. self.pool = Pool(processes=4) occurs in the __init__ section of the daemon. From run(self), which is executed after start(self), I will call self.pool.apply_async(self.runCmd, [cmd, 10], callback=self.logOutput). However, I wanted to address this zombie process before looking into that.
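To illustrate the zombie itself (this is a minimal Unix-only sketch, not the daemon code from the question): after os.kill, the parent must still wait on the child's pid so the kernel can release its process-table entry. Killing without reaping is exactly what leaves a zombie behind.

```python
import os
import signal
import time

pid = os.fork()
if pid == 0:
    # Child: stand-in for the daemon's main loop.
    time.sleep(30)
    os._exit(0)

# Parent: terminate the child, then *reap* it.  os.kill alone leaves
# the dead child in the process table (a zombie) until someone calls
# a wait function on its pid and collects the exit status.
time.sleep(0.1)
os.kill(pid, signal.SIGTERM)
reaped_pid, status = os.waitpid(pid, 0)
print(os.WIFSIGNALED(status))  # the child was terminated by a signal
```

Skipping the os.waitpid call and listing processes (e.g. `ps aux | grep defunct`) while the parent is still alive shows the `<defunct>` entry.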

How can I properly implement the pool inside the daemon to avoid this zombie process?

Answer

It is not possible to have 100% confidence in an answer without knowing what is going on in the child/daemon process, but consider if this could be it. Since you have worker threads in your child process, you actually need to build in some logic to join all of those threads once you receive the SIGTERM. Otherwise your process may not exit (and even if it does you may not exit gracefully). To do this you need to:

  • Write a signal handler, to be used in the child/daemon process, that catches the SIGTERM signal and triggers an event for your main thread
  • Install that signal handler in the main thread of the child/daemon process (very important)
  • The event handler for SIGTERM must tell all of the threads in the child/daemon process to stop
  • All of the threads must be join()ed once they finish (even if you assume SIGTERM will automatically destroy everything, you still have to implement this logic)
  • Once everything is joined and cleaned up, the main thread can exit
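The steps above might be sketched roughly as follows (the names handle_sigterm and stop_event are invented for this example, and the SIGTERM delivery is simulated so the sketch runs on its own):

```python
import signal
import threading
import time

stop_event = threading.Event()

def handle_sigterm(signum, frame):
    # Step 1: the handler only signals the event; the real shutdown
    # work happens outside the handler, in the main thread.
    stop_event.set()

def worker():
    # Step 3: each worker polls the event instead of blocking forever.
    while not stop_event.is_set():
        time.sleep(0.05)  # stand-in for real work

# Step 2: install the handler in the MAIN thread of the child/daemon
# (Python only delivers signals to the main thread).
signal.signal(signal.SIGTERM, handle_sigterm)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()

# ... the daemon's run loop would go here; simulate a SIGTERM arriving:
stop_event.set()

# Steps 4-5: join every thread, then the main thread may exit.
for t in threads:
    t.join()
print("clean exit")
```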

If you have threads for I/O and all kinds of things then this will be a real chore.

Also, I have found through experiment that the particular strategy for your event listener matters when you are using signal handlers. For example, if you use select.select() you must use a timeout and retry when the timeout occurs; otherwise your signal handler will not run. If you have a Queue.Queue object for events, and your event listener calls its .get() method, you must use a timeout, or the signal handler will not run. (The "real" signal handler implemented in C within the VM runs, but your Python signal handler doesn't unless you use timeouts.)
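A sketch of that timeout-and-retry pattern for the queue case (Python 3 spelling of the module; the names events, shutdown, and listen are invented for the example, and the SIGTERM is simulated by setting the flag directly):

```python
import queue
import threading
import time

events = queue.Queue()
shutdown = threading.Event()  # what a SIGTERM handler would set

def listen():
    while not shutdown.is_set():
        try:
            # A plain events.get() would block indefinitely inside C
            # code; the short timeout returns control to the loop so it
            # can notice the shutdown flag between waits.
            item = events.get(timeout=0.5)
        except queue.Empty:
            continue
        print("got", item)

events.put("task-1")
listener = threading.Thread(target=listen)
listener.start()
time.sleep(0.1)   # let the listener drain the queue
shutdown.set()    # simulate the SIGTERM handler firing
listener.join()
```

With the blocking no-timeout form, the loop would still be parked in .get() when the stop request arrived, and the listener would never re-check the flag.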

Good luck!
