Zombie processes, here we go again

Problem description

I'm struggling a lot with multiprocessing/threading/subprocessing. What I'm basically trying to do is execute every single binary available on my computer, so I wrote a Python script to do it. But I keep getting zombie ("defunct") processes, which end in a deadlock if all 4 of my workers are in this state. I've tried lots of different things, but nothing seems to work :(

The architecture looks like this:

|   \_ python -m dataset --generate
|       \_ worker1
|       |   \_ [thread1] firejail bin1
|       \_ worker2
|       |   \_ [thread1] firejail bin1
|       |   \_ [thread2] firejail bin2
|       |   \_ [thread3] firejail bin3
|       \_ worker3
|       |   \_ [thread1] [firejail] <defunct>
|       \_ worker4
|       |   \_ [thread1] [firejail] <defunct>

The 4 workers are created like this:

# spawn mode prevents deadlocks https://codewithoutrules.com/2018/09/04/python-multiprocessing/
from multiprocessing import get_context

with get_context("spawn").Pool() as pool:

    results = []

    for binary in binaries:
        result = pool.apply_async(legit.analyse, args=(binary,),
                                  callback=_binary_analysis_finished_callback,
                                  error_callback=error_callback)
        results.append(result)

(Note that I use a "spawn" pool, which starts each worker in a fresh interpreter instead of fork()ing a process that may already hold locks, but now I'm wondering if it helps here at all...)

Each worker creates multiple threads like this:

from threading import Thread

threads = []
executions = []

def thread_wrapper(*args):
    # run the command line inside firejail and record the traced execution
    flows, output, returncode = _exec_using_firejail(*args)
    executions.append(Execution(*args, flows, is_malware=False))

for command_line in potentially_working_command_lines:
    thread = Thread(target=thread_wrapper, args=(command_line,))
    threads.append(thread)
    thread.start()

for thread in threads:
    thread.join()

And each thread starts a new process in the firejail sandbox:

import os
import signal
import subprocess

process = subprocess.Popen(FIREJAIL_COMMAND +
                           ["strace", "-o", output_filename, "-ff", "-xx", "-qq", "-s", "1000"] + command_line,
                           stdout=subprocess.PIPE, stderr=subprocess.PIPE, preexec_fn=os.setsid)

try:
    out, errs = process.communicate(timeout=5, input=b"Y\nY\nY\nY\nY\nY\nY\nY\nY\nY\nY\nY\nY\nY\nY\nY\n")
    # print("stdout:", out)
    # print("stderr:", errs)

except subprocess.TimeoutExpired:
    # print(command_line, "timed out")
    os.killpg(os.getpgid(process.pid), signal.SIGKILL)
    out, errs = process.communicate()

I use os.killpg() rather than process.kill() because, for some reason, the subprocesses of my Popen process were not being killed... This works thanks to preexec_fn=os.setsid, which puts the child into a new session and process group that all of its descendants inherit. But even with this method, some programs such as zsh still leave a zombie behind, because zsh apparently changes its own process group, so my os.killpg doesn't work as expected...
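
One workaround for processes that escape their group this way is to kill by parent/child relationship instead of by process group, since that does not depend on the child keeping its pgid. Below is only a sketch using the third-party psutil package (pip install psutil), which is not part of the original script:

import psutil

def kill_tree(pid):
    """Kill a process and every descendant it spawned, then reap them."""
    try:
        parent = psutil.Process(pid)
    except psutil.NoSuchProcess:
        return
    children = parent.children(recursive=True)
    for proc in children + [parent]:
        try:
            proc.kill()
        except psutil.NoSuchProcess:
            pass
    # wait_procs() reaps the killed processes so none stay <defunct>
    psutil.wait_procs(children + [parent], timeout=5)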

I'm looking for a way to be 100% sure that all processes will be dead.

Answer

If you want to use the subprocess module for this, you should use the .kill method of the process object directly instead of going through the os module. Using communicate is a blocking call, so Python waits for a response. The timeout parameter helps, but it will be slow for lots of processes.

import os
import subprocess

cmd_list = (
    FIREJAIL_COMMAND
    + ["strace", "-o", output_filename, "-ff", "-xx", "-qq", "-s", "1000"]
    + command_line
)
proc = subprocess.Popen(
    cmd_list,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    preexec_fn=os.setsid
)

try:
    out, errs = proc.communicate(timeout=5, input=b"Y\n" * 16)
except subprocess.TimeoutExpired:
    proc.kill()
    out, errs = None, None

ret_code = proc.wait()
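
The final proc.wait() matters here: a killed child stays <defunct> until its parent collects the exit status, so skipping the wait() is exactly how zombies are produced.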

If you want to run it in a non-blocking loop over a set of processes, that is when you use poll. Here is an example; it assumes you have a list of filenames and corresponding command_lines that you want to feed into process creation.

import os
import subprocess
import time

def create_process(output_filename, command_line):
    cmd_list = (
        FIREJAIL_COMMAND
        + ["strace", "-o", output_filename, "-ff", "-xx", "-qq", "-s", "1000"]
        + command_line
    )
    proc = subprocess.Popen(
        cmd_list,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        preexec_fn=os.setsid
    )
    return proc

processes = [create_process(f, c) for f, c in zip(filenames, command_lines)]

TIMEOUT = 5
WAIT = 0.25  # how long to wait between checking the processes
finished = []
for _ in range(round(TIMEOUT / WAIT)):
    if not processes:
        break
    finished_new = []
    for proc in processes:
        # poll() returns None while the process is still running, and the
        # (possibly 0) return code once it has exited
        if proc.poll() is not None:
            finished_new.append(proc)
    # cleanup
    for proc in finished_new:
        processes.remove(proc)
    finished.extend(finished_new)
    time.sleep(WAIT)
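
The loop above leaves open what happens to processes that are still running once the timeout budget is spent. A minimal follow-up sketch, reusing the same processes list, would kill and then reap the stragglers:

for proc in processes:
    proc.kill()
    proc.wait()  # reap the exit status so the child doesn't stay <defunct>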
